Meganuclease variants cleaving a DNA target sequence from the mouse ROSA26 locus and uses thereof

ABSTRACT

An I-CreI variant, wherein one of the two I-CreI monomers has at least two substitutions, one in each of the two functional subdomains of the LAGLIDADG core domain situated respectively from positions 26 to 40 and 44 to 77 of I-CreI, said variant being able to cleave a DNA target sequence from the mouse ROSA26 locus. Use of said variant and derived products for the engineering of transgenic mice and recombinant mouse cell lines expressing a heterologous protein of interest.

The present application is a continuation of U.S. Ser. No. 12/663,164, filed Apr. 16, 2010, which is a National Stage of PCT/IB08/002,500, filed Jun. 6, 2008, and claims the benefit of PCT/IB07/002,830, filed Jun. 6, 2007; the entire contents of these applications are incorporated herein by reference.

The invention relates to a meganuclease variant cleaving a DNA target sequence from the mouse ROSA26 locus, to a vector encoding said variant, to a cell, an animal or a plant modified by said vector and to the use of said meganuclease variant and derived products for mouse genome engineering (recombinant protein production, construction of transgenic mice and recombinant mouse cell lines).

The mouse ROSA26 locus was discovered by Friedrich and Soriano in 1991 in a gene trap experiment using embryonic stem (ES) cells infected with a retrovirus (Friedrich, G. and P. Soriano, Genes & Development, 1991, 5, 1513-1523). The ROSA26 mouse gene trap line, where insertion occurs in intron 1 of the ROSA26 locus, a non-essential site, displays ubiquitous expression of the reporter gene during embryonic development, in newborns (Friedrich and Soriano, 1991, precited) and in hematopoietic cells (Zambrowicz et al., Proc. Natl. Acad. Sci. USA, 1997, 94, 3789-3794). The ROSA26 locus, located on mouse chromosome 6, produces three transcripts (FIG. 1). Two transcripts originate from a common promoter and share identical 5′ ends (exon 1 and the start of exon 2), but neither contains a significant ORF. A third one originates from the reverse strand (Zambrowicz et al., 1997, precited). Transgenes under the control of the mouse ROSA26 promoter show ubiquitous expression in embryo and adult mouse (Soriano, P., Nature Genetics, 1999, 21, 70-71). Targeting the ROSA26 locus in mouse ES cells has been widely used to construct transgenic mouse models (Kisseberth et al., Developmental Biology, 1999, 214, 128-138; Mao X. et al., Proc. Natl. Acad. Sci. USA, 1999, 96, 5037-5042; Soriano, 1999, precited; Awatramani et al., Nature Genetics, 2001, 29, 257-259; Mao X. et al., Blood, 2001, 97, 324-326; Possemato et al., Genesis, 2002, 32, 184-186; Mao, J. et al., Nucleic Acids Res., 2005, 33, e155; Yu et al., Proc. Natl. Acad. Sci. USA, 2005, 102, 8615-8620; International PCT Applications WO 99/53017, WO 02/098217, WO 03/020743, WO 2004/063381 and WO 2005/116070).

However, the efficiency of homologous recombination in mouse cells is very low (frequency: 10⁻⁶ to 10⁻⁹).

This efficiency can be enhanced by a DNA double-strand break (DSB) in the targeted locus. Such DSBs can be created by meganucleases, which are by definition sequence-specific endonucleases recognizing large sequences (Thierry, A. and B. Dujon, Nucleic Acids Res., 1992, 20, 5625-5631). These proteins can cleave unique sites in living cells, thereby enhancing gene targeting by 1000-fold or more in the vicinity of the cleavage site (Puchta et al., Nucleic Acids Res., 1993, 21, 5034-5040; Rouet et al., Mol. Cell. Biol., 1994, 14, 8096-8106; Choulika et al., Mol. Cell. Biol., 1995, 15, 1968-1973; Puchta et al., Proc. Natl. Acad. Sci. U.S.A., 1996, 93, 5055-5060; Sargent et al., Mol. Cell. Biol., 1997, 17, 267-277; Cohen-Tannoudji et al., Mol. Cell. Biol., 1998, 18, 1444-1448; Donoho, et al., Mol. Cell. Biol., 1998, 18, 4070-4078; Elliott et al., Mol. Cell. Biol., 1998, 18, 93-101).

However, although several hundred natural meganucleases, also referred to as “homing endonucleases”, have been identified (Chevalier, B. S. and B. L. Stoddard, Nucleic Acids Res., 2001, 29, 3757-3774), the repertoire of cleavable sequences is too limited to address the complexity of the genomes, and there is usually no cleavable site in a chosen gene. Theoretically, the making of artificial sequence-specific endonucleases with chosen specificities could alleviate this limit. Therefore, the making of meganucleases with tailored specificities is under intense investigation.

Recently, fusions of Zinc-Finger Proteins with the catalytic domain of FokI, a class IIS restriction endonuclease, were used to make functional sequence-specific endonucleases (Smith et al., Nucleic Acids Res., 1999, 27, 674-681; Bibikova et al., Mol. Cell. Biol., 2001, 21, 289-297; Bibikova et al., Genetics, 2002, 161, 1169-1175; Bibikova et al., Science, 2003, 300, 764; Porteus, M. H. and D. Baltimore, Science, 2003, 300, 763-; Alwin et al., Mol. Ther., 2005, 12, 610-617; Urnov et al., Nature, 2005, 435, 646-651; Porteus, M. H., Mol. Ther., 2006, 13, 438-446; International PCT Application WO 2007/014275). Such nucleases could recently be used for the engineering of the IL2RG gene in human cells from the lymphoid lineage (Urnov et al., Nature, 2005, 435, 646-651).

The binding specificity of Cys2-His2 type Zinc-Finger Proteins (ZFP) is easy to manipulate, probably because they represent a simple (specificity driven by essentially four residues per finger) and modular system (Pabo et al., Annu. Rev. Biochem., 2001, 70, 313-340; Jamieson et al., Nat. Rev. Drug Discov., 2003, 2, 361-368). Studies from the Pabo (Rebar, E. J. and C. O. Pabo, Science, 1994, 263, 671-673; Kim, J. S. and C. O. Pabo, Proc. Natl. Acad. Sci. USA, 1998, 95, 2812-2817), Klug (Choo, Y. and A. Klug, Proc. Natl. Acad. Sci. USA, 1994, 91, 11163-11167; Isalan M. and A. Klug, Nat. Biotechnol., 2001, 19, 656-660) and Barbas (Choo, Y. and A. Klug, Proc. Natl. Acad. Sci. USA, 1994, 91, 11163-11167; Isalan M. and A. Klug, Nat. Biotechnol., 2001, 19, 656-660) laboratories resulted in a large repertoire of novel artificial ZFPs, able to bind most G/ANNG/ANNG/ANN sequences.

Nevertheless, ZFPs might have their limitations, especially for applications requiring a very high level of specificity, such as therapeutic applications. It was recently shown that FokI nuclease activity in fusion proteins can act with either one recognition site or two sites separated by varied distances via a DNA loop, including in the presence of some DNA-binding-defective mutants of FokI (Catto et al., Nucleic Acids Res., 2006, 34, 1711-1720). Thus, specificity might be very degenerate, as illustrated by toxicity in mammalian cells and Drosophila (Bibikova et al., Genetics, 2002, 161, 1169-1175; Bibikova et al., Science, 2003, 300, 764-).

In the wild, meganucleases are essentially represented by homing endonucleases. Homing Endonucleases (HEs) are a widespread family of natural meganucleases including hundreds of protein families (Chevalier, B. S. and B. L. Stoddard, Nucleic Acids Res., 2001, 29, 3757-3774). These proteins are encoded by mobile genetic elements which propagate by a process called “homing”: the endonuclease cleaves a cognate allele from which the mobile element is absent, thereby stimulating a homologous recombination event that duplicates the mobile DNA into the recipient locus. Given their exceptional cleavage properties in terms of efficacy and specificity, they could represent ideal scaffolds from which to derive novel, highly specific endonucleases.

HEs belong to four major families. The LAGLIDADG family, named after a conserved peptidic motif involved in the catalytic center, is the most widespread and the best characterized group. Seven structures are now available. Whereas most proteins from this family are monomeric and display two LAGLIDADG motifs, a few have only one motif, but dimerize to cleave palindromic or pseudo-palindromic target sequences.

Although the LAGLIDADG peptide is the only conserved region among members of the family, these proteins share a very similar architecture (FIG. 2). The catalytic core is flanked by two DNA-binding domains with a perfect two-fold symmetry for homodimers such as I-CreI (Chevalier, et al., Nat. Struct. Biol., 2001, 8, 312-316) and I-MsoI (Chevalier et al., J. Mol. Biol., 2003, 329, 253-269) and with a pseudo-symmetry for monomers such as I-SceI (Moure et al., J. Mol. Biol., 2003, 334, 685-69), I-DmoI (Silva et al., J. Mol. Biol., 1999, 286, 1123-1136) or I-AniI (Bolduc et al., Genes Dev., 2003, 17, 2875-2888). Both monomers, or both domains (for monomeric proteins), contribute to the catalytic core, organized around divalent cations. Just above the catalytic core, the two LAGLIDADG peptides also play an essential role in the dimerization interface. DNA binding depends on two typical saddle-shaped ββαββ folds, sitting on the DNA major groove. Other domains can be found, for example in inteins such as PI-PfuI (Ichiyanagi et al., J. Mol. Biol., 2000, 300, 889-901) and PI-SceI (Moure et al., Nat. Struct. Biol., 2002, 9, 764-770), whose protein splicing domain is also involved in DNA binding.

The making of functional chimeric meganucleases, by fusing the N-terminal I-DmoI domain with an I-CreI monomer (Chevalier et al., Mol. Cell., 2002, 10, 895-905; Epinat et al., Nucleic Acids Res, 2003, 31, 2952-62; International PCT Applications WO 03/078619 and WO 2004/031346), has demonstrated the plasticity of LAGLIDADG proteins.

Besides, different groups have used a rational approach to locally alter the specificity of I-CreI (Seligman et al., Genetics, 1997, 147, 1653-1664; Sussman et al., J. Mol. Biol., 2004, 342, 31-41; International PCT Applications WO 2006/097784, WO 2006/097853 and WO 2007/049156; Arnould et al., J. Mol. Biol., 2006, 355, 443-458; Rosen et al., Nucleic Acids Res., 2006, 34, 4791-4800; Smith et al., Nucleic Acids Res., Epub 27 Nov. 2006), I-SceI (Doyon et al., J. Am. Chem. Soc., 2006, 128, 2477-2484), PI-SceI (Gimble et al., J. Mol. Biol., 2003, 334, 993-1008) and I-MsoI (Ashworth et al., Nature, 2006, 441, 656-659).

In addition, hundreds of I-CreI derivatives with locally altered specificity were engineered by combining the semi-rational approach and High Throughput Screening:

-   Residues Q44, R68 and R70 or Q44, R68, D75 and I77 of I-CreI were mutagenized and a collection of variants with altered specificity towards the nucleotides at positions ±3 to 5 of the DNA target (5NNN DNA target) was identified by screening (International PCT Applications WO 2006/097784 and WO 2006/097853; Arnould et al., J. Mol. Biol., 2006, 355, 443-458; Smith et al., Nucleic Acids Res., Epub 27 Nov. 2006).

-   Residues K28, N30 and Q38, N30, Y33 and Q38 or K28, Y33, Q38 and S40 of I-CreI were mutagenized and a collection of variants with altered specificity towards the nucleotides at positions ±8 to 10 of the DNA target (10NNN DNA target) was identified by screening (Smith et al., Nucleic Acids Res., Epub 27 Nov. 2006; International PCT Application WO 2007/049156).

-   Residues 28 to 40 and 44 to 77 of I-CreI were shown to form two separable functional subdomains, able to bind distinct parts of a homing endonuclease half-site (Smith et al. Nucleic Acids Res., Epub 27 Nov. 2006; International PCT Application WO 2007/049095).

-   The combination of mutations from the two subdomains of I-CreI within the same monomer allowed the design of novel chimeric molecules (homodimers) able to cleave a palindromic combined DNA target sequence comprising the nucleotides at positions ±3 to 5 and ±8 to 10 which are bound by each subdomain (Smith et al., Nucleic Acids Res., Epub 27 Nov. 2006; International PCT Application WO 2007/049156).

-   Two different variants were combined and assembled in a functional heterodimeric endonuclease able to cleave a chimeric target resulting from the fusion of a different half of each variant DNA target sequence (Arnould et al., precited; International PCT Application WO 2006/097854). Interestingly, the novel proteins had kept proper folding and stability, high activity, and a narrow specificity.

The combination of the two former steps allows a larger combinatorial approach, involving four different subdomains. The different subdomains can be modified separately and combined to obtain an entirely redesigned meganuclease variant (heterodimer or single-chain molecule) with chosen specificity, as illustrated in FIG. 3. In a first step, couples of novel meganucleases are combined in new molecules (“half-meganucleases”) cleaving palindromic targets derived from the target one wants to cleave. Then, the combination of such “half-meganucleases” can result in a heterodimeric species cleaving the target of interest. The assembly of four sets of mutations into heterodimeric endonucleases cleaving a model target sequence or a sequence from the human RAG1 gene has been described in Smith et al. (Nucleic Acids Res., Epub 27 Nov. 2006).
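The derivation of palindromic targets from a non-palindromic target of interest can be sketched in Python as follows. This is a minimal illustration, assuming an even-length target split into two equal halves, each half being fused to its own reverse complement; the function names are hypothetical, the example sequence is arbitrary, and any special handling of the four central nucleotides is deliberately ignored.

```python
# Sketch only: helper names are illustrative, not taken from the cited work.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """A DNA site is palindromic when it equals its reverse complement."""
    return seq == reverse_complement(seq)


def derived_palindromes(target):
    """Build the two palindromic targets derived from each half of `target`.

    Assumption: the target has an even length and the halves are taken
    around its center (11 + 11 nucleotides for a 22 bp target).
    """
    if len(target) % 2:
        raise ValueError("expected an even-length target")
    half = len(target) // 2
    left, right = target[:half], target[half:]
    left_palindrome = left + reverse_complement(left)
    right_palindrome = reverse_complement(right) + right
    return left_palindrome, right_palindrome


# Arbitrary 22-nucleotide example sequence.
example = "tgagatgatacaaagaatttag"
left_pal, right_pal = derived_palindromes(example)
print(left_pal, is_palindromic(left_pal))
print(right_pal, is_palindromic(right_pal))
```

Each derived palindrome keeps one half of the original target unchanged, which is why a homodimeric “half-meganuclease” selected on it is a candidate monomer for the final heterodimer.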

However, the targets tested in this report were identical to the original sequence of the palindromic I-CreI site (C1221; FIG. 5) at the positions ±2 and ±1. Even though the base-pairs ±1 and ±2 do not display any contact with the protein, it has been shown that these positions are not devoid of information content (Chevalier et al., J. Mol. Biol., 2003, 329, 253-269), especially for the base-pair ±1, and could be a source of additional substrate specificity (Argast et al., J. Mol. Biol., 1998, 280, 345-353; Jurica et al., Mol. Cell., 1998, 2, 469-476; Chevalier, B. S. and B. L. Stoddard, Nucleic Acids Res., 2001, 29, 3757-3774). In vitro selection of cleavable, randomly mutagenized I-CreI targets (Argast et al., precited) revealed the importance of these four base-pairs on protein binding and cleavage activity. It has been suggested that the network of ordered water molecules found in the active site was important for positioning the DNA target (Chevalier et al., Biochemistry, 2004, 43, 14015-14026). In addition, the extensive conformational changes that appear in this region upon I-CreI binding suggest that the four central nucleotides could contribute to the substrate specificity, possibly by sequence-dependent conformational preferences (Chevalier et al., 2003, precited).

Thus, it was not clear if mutants identified on 10NNN and 5NNN DNA targets as homodimers cleaving a palindromic sequence with the four central nucleotides being gtac, would allow the design of new endonucleases that would cleave targets containing changes in the four central nucleotides.

The Inventors have identified a series of DNA targets in the mouse ROSA26 locus that could be cleaved by I-CreI variants (FIG. 17). The combinatorial approach described in FIG. 3 was used to entirely redesign the DNA binding domain of the I-CreI protein and thereby engineer novel meganucleases with fully engineered specificity, to cleave a DNA target from the mouse ROSA26 locus (rosa1) which differs from the I-CreI C1221 22 bp palindromic site by 13 nucleotides including one (position +1) of the four central nucleotides (FIG. 5).

Even though the combined variants were initially identified towards nucleotides 10NNN and 5NNN respectively, and a strong impact of the four central nucleotides of the target on the activity of the engineered meganuclease was observed, functional meganucleases with a profound change in specificity were selected. Furthermore, the activity of the engineered protein could be significantly improved by two successive rounds of random mutagenesis and screening, to a level comparable with the activity of the I-CreI protein.

The ability to generate a double-strand break at the ROSA26 locus provides a means to significantly enhance homologous recombination at the locus. Thus, a meganuclease targeting the ROSA26 locus will allow efficient gene insertions in mouse cells (FIG. 4). The ability to efficiently insert genes (knock-in) at this locus has the advantage of allowing reproducible expression levels as well as predictable timelines for generating insertions. Potential applications include the production of recombinant proteins in mouse cells and the engineering of transgenic mice and recombinant mouse cell lines that can be used, for example, for protein production, gene function studies, drug screening, or as disease models.

The invention relates to an I-CreI variant wherein at least one of the two I-CreI monomers has at least two substitutions, one in each of the two functional subdomains of the LAGLIDADG core domain situated respectively from positions 26 to 40 and 44 to 77 of I-CreI, and is able to cleave a DNA target sequence from the mouse ROSA26 locus.

The cleavage activity of the variant according to the invention may be measured by any well-known, in vitro or in vivo cleavage assay, such as those described in the International PCT Application WO 2004/067736; Epinat et al., Nucleic Acids Res., 2003, 31, 2952-2962; Chames et al., Nucleic Acids Res., 2005, 33, e178 and Arnould et al., J. Mol. Biol., 2006, 355, 443-458. For example, the cleavage activity of the variant of the invention may be measured by a direct repeat recombination assay, in yeast or mammalian cells, using a reporter vector. The reporter vector comprises two truncated, non-functional copies of a reporter gene (direct repeats) and the genomic DNA target sequence within the intervening sequence, cloned in a yeast or a mammalian expression vector. Expression of the variant results in a functional endonuclease which is able to cleave the genomic DNA target sequence. This cleavage induces homologous recombination between the direct repeats, resulting in a functional reporter gene, whose expression can be monitored by an appropriate assay.

DEFINITIONS

-   Amino acid refers to a natural or synthetic amino acid including enantiomers and stereoisomers of the preceding amino acids.
-   Amino acid residues in a polypeptide sequence are designated herein according to the one-letter code, in which, for example, Q means Gln or Glutamine residue, R means Arg or Arginine residue and D means Asp or Aspartic acid residue.
-   Acidic amino acid refers to aspartic acid (D) and glutamic acid (E).
-   Basic amino acid refers to lysine (K), arginine (R) and histidine (H).
-   Small amino acid refers to glycine (G) and alanine (A).
-   Aromatic amino acid refers to phenylalanine (F), tryptophan (W) and tyrosine (Y).
-   Nucleotides are designated as follows: one-letter code is used for designating the base of a nucleoside: a is adenine, t is thymine, c is cytosine, and g is guanine. For the degenerated nucleotides, r represents g or a (purine nucleotides), k represents g or t, s represents g or c, w represents a or t, m represents a or c, y represents t or c (pyrimidine nucleotides), d represents g, a or t, v represents g, a or c, b represents g, t or c, h represents a, t or c, and n represents g, a, t or c.
-   by “meganuclease” is intended an endonuclease having a double-stranded DNA target sequence of 12 to 45 bp. Said meganuclease is either a dimeric enzyme, wherein each domain is on a monomer, or a monomeric enzyme comprising the two domains on a single polypeptide.
-   by “meganuclease domain” is intended the region which interacts with one half of the DNA target of a meganuclease and is able to associate with the other domain of the same meganuclease which interacts with the other half of the DNA target to form a functional meganuclease able to cleave said DNA target.
-   by “meganuclease variant” or “variant” is intended a meganuclease obtained by replacement of at least one residue in the amino acid sequence of the wild-type meganuclease (natural meganuclease) with a different amino acid.
-   by “functional variant” is intended a variant which is able to cleave a DNA target sequence, preferably said target is a new target which is not cleaved by the parent meganuclease. For example, such variants have amino acid variation at positions contacting the DNA target sequence or interacting directly or indirectly with said DNA target.
-   by “meganuclease variant with novel specificity” is intended a variant having a pattern of cleaved targets different from that of the parent meganuclease. The terms “novel specificity”, “modified specificity”, “novel cleavage specificity”, “novel substrate specificity”, which are equivalent and used interchangeably, refer to the specificity of the variant towards the nucleotides of the DNA target sequence.
-   by “I-CreI” is intended the wild-type I-CreI having the sequence SWISSPROT P05725, corresponding to the sequence SEQ ID NO: 1 in the sequence listing or the sequence pdb accession code 1g9y, corresponding to the sequence SEQ ID NO: 133 in the sequence listing.
-   by “domain” or “core domain” is intended the “LAGLIDADG homing endonuclease core domain” which is the characteristic α₁β₁β₂α₂β₃β₄α₃ fold of the homing endonucleases of the LAGLIDADG family, corresponding to a sequence of about one hundred amino acid residues. Said domain comprises four beta-strands (β₁β₂β₃β₄) folded in an antiparallel beta-sheet which interacts with one half of the DNA target. This domain is able to associate with another LAGLIDADG homing endonuclease core domain which interacts with the other half of the DNA target to form a functional endonuclease able to cleave said DNA target. For example, in the case of the dimeric homing endonuclease I-CreI (163 amino acids), the LAGLIDADG homing endonuclease core domain corresponds to the residues 6 to 94.
-   by “single-chain meganuclease” is intended a meganuclease comprising two LAGLIDADG homing endonuclease domains or core domains linked by a peptidic spacer. The single-chain meganuclease is able to cleave a chimeric DNA target sequence comprising one different half of each parent meganuclease target sequence.
-   by “subdomain” is intended the region of a LAGLIDADG homing endonuclease core domain which interacts with a distinct part of a homing endonuclease DNA target half-site. Two different subdomains behave independently and the mutation in one subdomain does not alter the binding and cleavage properties of the other subdomain. Therefore, two subdomains bind distinct parts of a homing endonuclease DNA target half-site.
-   by “beta-hairpin” is intended two consecutive beta-strands of the antiparallel beta-sheet of a LAGLIDADG homing endonuclease core domain (β₁β₂ or β₃β₄) which are connected by a loop or a turn.
-   by “I-CreI site” is intended a 22 to 24 bp double-stranded DNA sequence which is cleaved by I-CreI. I-CreI sites include the wild-type (natural) non-palindromic I-CreI homing site and the derived palindromic sequences such as the sequence 5′-t⁻¹²c⁻¹¹a⁻¹⁰a⁻⁹a⁻⁸a⁻⁷c⁻⁶g⁻⁵t⁻⁴c⁻³g⁻²t⁻¹a₊₁c₊₂g₊₃a₊₄c₊₅g₊₆t₊₇t₊₈t₊₉t₊₁₀g₊₁₁a₊₁₂, also called C1221 (SEQ ID NO:2; FIG. 5).
-   by “DNA target”, “DNA target sequence”, “target sequence”, “target-site”, “target”, “site”, “site of interest”, “recognition site”, “recognition sequence”, “homing recognition site”, “homing site”, “cleavage site” is intended a 20 to 24 bp double-stranded palindromic, partially palindromic (pseudo-palindromic) or non-palindromic polynucleotide sequence that is recognized and cleaved by a LAGLIDADG homing endonuclease such as I-CreI, or a variant, or a single-chain chimeric meganuclease derived from I-CreI. These terms refer to a distinct DNA location, preferably a genomic location, at which a double-stranded break (cleavage) is to be induced by the meganuclease. The DNA target is defined by the 5′ to 3′ sequence of one strand of the double-stranded polynucleotide, as indicated above for C1221. Cleavage of the DNA target occurs at the nucleotides at positions +2 and −2, respectively for the sense and the antisense strand. Unless otherwise indicated, the position at which cleavage of the DNA target by an I-CreI meganuclease variant occurs corresponds to the cleavage site on the sense strand of the DNA target.
-   by “DNA target half-site”, “half cleavage site” or “half-site” is intended the portion of the DNA target which is bound by each LAGLIDADG homing endonuclease core domain.
-   by “chimeric DNA target” or “hybrid DNA target” is intended the fusion of a different half of two parent meganuclease target sequences. In addition, at least one half of said target may comprise the combination of nucleotides which are bound by at least two separate subdomains (combined DNA target).
-   by “mouse ROSA26 locus” is intended the locus located in mouse chromosome 6 and having the sequence corresponding to EMBL accession number CQ880114 (SEQ ID NO: 3; 13139 bp). The ROSA26 locus produces three transcripts (FIG. 1): two transcripts originate from a common promoter and share identical 5′ ends (exon 1 and the start of exon 2), but neither contains a significant ORF; a third one originates from the reverse strand.
-   by “DNA target sequence from the mouse ROSA26 locus”, “genomic DNA target sequence”, “genomic DNA cleavage site”, “genomic DNA target” or “genomic target” is intended a 20 to 24 bp sequence of the mouse ROSA26 locus which is recognized and cleaved by a meganuclease variant.
-   by “vector” is intended a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.
-   by “homologous” is intended a sequence with enough identity to another one to lead to a homologous recombination between sequences, more particularly having at least 95% identity, preferably 97% identity and more preferably 99%.
-   “identity” refers to sequence identity between two nucleic acid molecules or polypeptides. Identity can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base, then the molecules are identical at that position. A degree of similarity or identity between nucleic acid or amino acid sequences is a function of the number of identical or matching nucleotides at positions shared by the nucleic acid sequences. Various alignment algorithms and/or programs may be used to calculate the identity between two sequences, including FASTA, or BLAST which are available as a part of the GCG sequence analysis package (University of Wisconsin, Madison, Wis.), and can be used with, e.g., default settings.
-   “individual” includes mammals, as well as other vertebrates (e.g., birds, fish and reptiles). The terms “mammal” and “mammalian”, as used herein, refer to any vertebrate animal, including monotremes, marsupials and placental, that suckle their young and either give birth to living young (eutherian or placental mammals) or are egg-laying (metatherian or nonplacental mammals). Examples of mammalian species include humans and other primates (e.g., monkeys, chimpanzees), rodents (e.g., rats, mice, guinea pigs) and others such as for example: cows, pigs and horses.
-   by “mutation” is intended the substitution, deletion or insertion of one or more nucleotides/amino acids in a polynucleotide (cDNA, gene) or a polypeptide sequence. Said mutation can affect the coding sequence of a gene or its regulatory sequence. It may also affect the structure of the genomic sequence or the structure/stability of the encoded mRNA.
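The nucleotide code and the position numbering used in these definitions can be illustrated with a short Python sketch. It encodes the degenerate-nucleotide table exactly as defined above and reads nucleotides of the 24 bp C1221 site by their signed positions (−12 to −1 and +1 to +12, with no position 0). The helper names are hypothetical and chosen for illustration only.

```python
import itertools

# Degenerate nucleotide code, as listed in the definitions above.
DEGENERATE = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ga", "k": "gt", "s": "gc", "w": "at", "m": "ac", "y": "tc",
    "d": "gat", "v": "gac", "b": "gtc", "h": "atc", "n": "gatc",
}


def expand(pattern):
    """List every concrete sequence described by a degenerate pattern."""
    choices = (DEGENERATE[ch] for ch in pattern.lower())
    return ["".join(combo) for combo in itertools.product(*choices)]


def matches(pattern, seq):
    """True if `seq` is one of the sequences described by `pattern`."""
    return len(pattern) == len(seq) and all(
        base in DEGENERATE[code] for code, base in zip(pattern.lower(), seq.lower())
    )


def at_position(site, pos):
    """Nucleotide at signed position `pos` of an even-length site."""
    half = len(site) // 2
    if pos == 0 or abs(pos) > half:
        raise ValueError("position out of range")
    return site[half + pos] if pos < 0 else site[half + pos - 1]


# C1221, read from the subscripted sequence given in the definitions.
C1221 = "tcaaaacgtcgtacgacgttttga"

central_four = "".join(at_position(C1221, p) for p in (-2, -1, 1, 2))
triplet_5nnn = "".join(at_position(C1221, p) for p in (-5, -4, -3))
triplet_10nnn = "".join(at_position(C1221, p) for p in (-10, -9, -8))
print(central_four, triplet_5nnn, triplet_10nnn)
```

The three printed segments correspond to the four central nucleotides and to the −5 to −3 and −10 to −8 triplets of the C1221 site.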

The variant according to the present invention may be a homodimer or a heterodimer. Preferably, both monomers of the heterodimer are mutated at positions 26 to 40 and/or 44 to 77. More preferably, both monomers have different substitutions both at positions 26 to 40 and 44 to 77 of I-CreI.

In a preferred embodiment of said variant, said substitution(s) in the subdomain situated from positions 44 to 77 of I-CreI are at positions 44, 68, 70, 75 and/or 77.

In another preferred embodiment of said variant, said substitution(s) in the subdomain situated from positions 26 to 40 of I-CreI are at positions 26, 28, 30, 32, 33, 38 and/or 40.

In another preferred embodiment of said variant, said substitutions are replacements of the initial amino acids with amino acids selected from the group consisting of: A, D, E, G, H, K, N, P, Q, R, S, T, Y, C, V, L and W.

In another preferred embodiment of said variant, it comprises one or more mutations at positions of other amino acid residues which contact the DNA target sequence or interact with the DNA backbone or with the nucleotide bases, directly or via a water molecule; these residues are well-known in the art (Jurica et al., Molecular Cell., 1998, 2, 469-476; Chevalier et al., J. Mol. Biol., 2003, 329, 253-269).

In particular, additional substitutions may be introduced at positions contacting the phosphate backbone, for example in the final C-terminal loop (positions 137 to 143; Prieto et al., Nucleic Acids Res., Epub 22 Apr. 2007). Preferably said residues are involved in binding and cleavage of said DNA cleavage site. More preferably, said residues are at positions 138, 139, 142 or 143 of I-CreI. Two residues may be mutated in one variant provided that each mutation is in a different pair of residues chosen from the pair of residues at positions 138 and 139 and the pair of residues at positions 142 and 143. The mutations which are introduced modify the interaction(s) of said amino acid(s) of the final C-terminal loop with the phosphate backbone of the I-CreI site. Preferably, the residue at position 138 or 139 is substituted by a hydrophobic amino acid to avoid the formation of hydrogen bonds with the phosphate backbone of the DNA cleavage site. For example, the residue at position 138 is substituted by an alanine or the residue at position 139 is substituted by a methionine. The residue at position 142 or 143 is advantageously substituted by a small amino acid, for example a glycine, to decrease the size of the side chains of these amino acid residues. More preferably, said substitutions in the final C-terminal loop modify the specificity of the variant towards the nucleotides at positions ±1 to 2, ±6 to 7 and/or ±11 to 12 of the I-CreI site.

In another preferred embodiment of said variant, it comprises one or more additional mutations that improve the binding and/or the cleavage properties of the variant towards the DNA target sequence from the mouse ROSA26 locus.

The additional residues which are mutated may be on the entire I-CreI sequence, and in particular in the C-terminal half of I-CreI (positions 80 to 163). For example, the variant comprises one or more additional substitutions at positions 19, 24, 79, 105, 107, 151, 153 and/or 158. Said substitutions are advantageously selected from the group consisting of: G19S, I24V, S79G, V105A, K107R, V151A, D153G and K158E.
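The substitution notation used here (for example G19S: glycine at position 19 replaced by serine) can be handled programmatically. The following Python sketch parses such codes and reports which of the regions named in this description a position falls in. The function names are hypothetical, and the sequence used to demonstrate `apply_substitutions` is a toy string, not the I-CreI sequence.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split a code such as 'G19S' into ('G', 19, 'S')."""
    match = SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def region(position):
    """Name the region of I-CreI, as delimited in the text, for a position."""
    if 26 <= position <= 40:
        return "subdomain 26-40"
    if 44 <= position <= 77:
        return "subdomain 44-77"
    if 80 <= position <= 163:
        return "C-terminal half"
    return "other"


def apply_substitutions(sequence, codes):
    """Apply substitution codes to a protein sequence (1-based positions)."""
    residues = list(sequence)
    for code in codes:
        wild_type, position, replacement = parse_substitution(code)
        if residues[position - 1] != wild_type:
            raise ValueError(f"{code}: residue {position} is {residues[position - 1]}")
        residues[position - 1] = replacement
    return "".join(residues)


additional = ["G19S", "I24V", "S79G", "V105A", "K107R", "V151A", "D153G", "K158E"]
for code in additional:
    _, position, _ = parse_substitution(code)
    print(code, region(position))

# Toy sequence for illustration only.
print(apply_substitutions("MGSK", ["G2A"]))
```

Checking the wild-type residue before replacing it guards against numbering mismatches, for instance when a scaffold carries an inserted residue near the N-terminus.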

The variant of the invention may be derived from the wild-type I-CreI (SEQ ID NO: 1 or 133) or an I-CreI scaffold protein having at least 85% identity, preferably at least 90% identity, more preferably at least 95% identity with SEQ ID NO: 133, such as the scaffold of SEQ ID NO: 4 (167 amino acids) having the insertion of an alanine at position 2, the substitution D75N, and the insertion of AAD at the C-terminus (positions 164 to 166) of the I-CreI sequence.

In addition, the variants of the invention may include one or more residues inserted at the NH₂ terminus and/or COOH terminus of the sequence. For example, a tag (epitope (HA-tag (YPYDVPDYA; SEQ ID NO: 135) or S-tag (KETAAAKFERQHMDS; SEQ ID NO: 136)) or polyhistidine sequence) is introduced at the NH₂ terminus and/or COOH terminus; said tag is useful for the detection and/or the purification of said variant. When the tag is introduced at the NH₂ terminus, the sequence of the tag may either replace the first amino acids of the variant (at least the first methionine and possibly the second amino acid of the variant; tag starting with a methionine) or be inserted between the first (methionine) and the second amino acids or the first and the third amino acids of the variant (tag with no methionine).

The variant may also comprise a nuclear localization signal (NLS); saidNLS is useful for the importation of said variant into the cell nucleus.An example of NLS is KKKRK (SEQ ID NO: 134). The NLS may be insertedjust after the first methionine of the variant or just after anN-terminal tag.
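The two N-terminal insertion rules described above (a methionine-free tag placed between the first methionine and the second amino acid, and an NLS placed just after the first methionine or just after an N-terminal tag) can be illustrated with a minimal Python sketch. The function names and the short placeholder protein `MAQTK` are hypothetical and are not taken from the sequence listing; only the HA-tag and NLS peptides come from the text.

```python
# Peptides quoted in the text.
NLS = "KKKRK"         # SEQ ID NO: 134
HA_TAG = "YPYDVPDYA"  # SEQ ID NO: 135 (tag with no methionine)


def insert_after_first_met(protein: str, peptide: str) -> str:
    """Insert `peptide` between the first methionine and the second residue."""
    if not protein.startswith("M"):
        raise ValueError("protein is expected to start with a methionine")
    return protein[0] + peptide + protein[1:]


def add_tag_then_nls(protein: str, tag: str = HA_TAG, nls: str = NLS) -> str:
    """Insert a methionine-free tag after the first Met, with the NLS just after the tag."""
    return insert_after_first_met(protein, tag + nls)


if __name__ == "__main__":
    placeholder = "MAQTK"  # hypothetical stand-in, NOT an I-CreI sequence
    print(insert_after_first_met(placeholder, NLS))
    print(add_tag_then_nls(placeholder))
```

The sketch only handles the "tag with no methionine" case; a tag starting with a methionine would instead replace the first residue(s) of the variant.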

The variant according to the present invention may be a homodimer which is able to cleave a palindromic or pseudo-palindromic DNA target sequence.

Alternatively, said variant is a heterodimer, resulting from the association of a first and a second monomer having different substitutions at positions 26 to 40 and/or 44 to 77 of I-CreI, said heterodimer being able to cleave a non-palindromic DNA target sequence from the mouse ROSA26 locus.

The DNA target sequence which is cleaved by said variant may be in an exon or in an intron of the mouse ROSA26 locus.

In another preferred embodiment of said variant, said DNA target is selected from the group consisting of the sequences SEQ ID NO: 5 to 30 (FIG. 17) which cover all of the mouse ROSA26 locus.

TABLE I: ROSA26 locus target sequences

| SEQ ID NO: | Target sequence | Target position* | Target location |
|---|---|---|---|
| 5 | cgcccctgcgcaacgtggcagg | 3220 | Intron 1 |
| 6 | ccgcacccttctccggaggggg | 3490 | Intron 1 |
| 7 | tggactggcttgactcatggca | 4717 | Intron 1 |
| 8 | ccagcctggtctacacatcaag | 5584 | Intron 1 |
| 9 | ctatctaggatagccaggaata | 5608 | Intron 1 |
| 10 | cagcctgatttccagggtgggg | 5906 | Intron 1 |
| 11 | taaacctcataaaatagttatg | 5992 | Intron 1 |
| 12 | tcagattcttttataggggaca | 6409 | Intron 1 |
| 13 | ttgtatatctcaaataatgctg | 7394 | Intron 1 |
| 14 | tgagccactgagaatggtctca | 8070 | Intron 1 |
| 15 | caacatgatgttcataatccca | 8304 | Exon 2 |
| 16 | ttaaatgttgctatgcagtttg | 8394 | Exon 2 |
| 17 | ttccccaaagttccaaattata | 8583 | Exon 2 |
| 18 | taacaccgtttgtgttataata | 8678 | Exon 2 |
| 19 | tatactgtctttagagagttta | 8749 | Exon 2 |
| 20 | tgtaatagcttagaaaatttaa | 9010 | Exon 2 |
| 21 | tttaatctattggtttgtctag | 9280 | Intron 2 |
| 22 | ttgtacattgttaggagtgtga | 9556 | Intron 2 |
| 23 | tgcactggtacacataatttca | 10263 | Intron 2 |
| 24 | tgagatgatacaaagaatttag | 11558 | Intron 2 and antisense transcript |
| 25 | ccatcctataaaagaaggtcaa | 12391 | Exon 3 or antisense transcript |
| 26 | tttaatctattgcaaaaggtaa | 12414 | Exon 3 or antisense transcript |
| 27 | tagtccagtgttatagagttag | 12535 | Exon 3 or antisense transcript |
| 28 | ttctacctttttccaaatggca | 12791 | Exon 3 or antisense transcript |
| 29 | ttttctgtggagacaaaggtaa | 12904 | Exon 3 or antisense transcript |
| 30 | tgagatggctcagcaaataatg | 12954 | Exon 3 or antisense transcript |

*The indicated position is that of the first nucleotide of the target.
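The distinction drawn above between palindromic targets (cleavable by a homodimer) and non-palindromic targets (requiring a heterodimer) can be checked computationally: a double-stranded target is palindromic when it equals its own reverse complement. The following is a minimal sketch, assuming a strict (not pseudo-palindromic) test; the function names are illustrative, and the only sequence used from Table I is SEQ ID NO: 5.

```python
# Translation table for complementing lowercase/uppercase DNA letters.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True when the sequence reads the same on both strands (5' to 3')."""
    return seq.lower() == reverse_complement(seq).lower()


if __name__ == "__main__":
    seq_id_5 = "cgcccctgcgcaacgtggcagg"  # Table I, SEQ ID NO: 5
    print(len(seq_id_5), is_palindromic(seq_id_5))
    print(is_palindromic("acgt"))  # short artificial palindrome, for contrast
```

A pseudo-palindromic target would need a tolerance for mismatches (typically at the central positions), which this strict comparison does not model.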

More preferably, the monomers of the variant have at least the following substitutions, respectively for the first and the second monomer:

-   N30H, Y33S, Q44E, R68C, R70S and D75N (first monomer), and N30D, Y33R, Q38T, Q44K, R68E, R70S, and I77R (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 5 which is located in the first intron (FIGS. 1 and 17; Table I),
-   S32N, Y33G, Q44K, R70E and D75N (first monomer), and S32T, Q38W, Q44K, R68E, R70S and I77R (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 6 which is located in the first intron (FIGS. 1 and 17; Table I),
-   Y33R, Q38N, S40Q, Q44N, R70S, D75R and I77D (first monomer), and N30H, Y33S, Q44A, R70S, D75Q and I77E (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 7 which is located in the first intron (FIGS. 1 and 17; Table I),
-   K28S, Q38R, S40K, Q44D, R68Y, R70S, D75S and I77R (first monomer), and Y33C, Q38A, R68A, R70K and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 8 which is located in the first intron (FIGS. 1 and 17; Table I),
-   Y33C, Q44T, R70S and D75Y (first monomer), and S32D, Q38C, Q44D, R68Y, R70S, D75S and I77R (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 9 which is located in the first intron (FIGS. 1 and 17; Table I),
-   S32T, Y33C, R68T, R70N and D75N (first monomer), and S32T, Q38W, Q44K, R70E and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 10 which is located in the first intron (FIGS. 1 and 17; Table I),
-   R70S, D75R and I77Y (first monomer), and Y33R, Q38A, S40Q, Q44A, R70S and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 11 which is located in the first intron (FIGS. 1 and 17; Table I),
-   K28S, Q38R, S40K, Q44T, R68N, R70N and D75N (first monomer), and Y33H, Q38S, Q44K, R68Y, R70S, D75Q and I77N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 12 which is located in the first intron (FIGS. 1 and 17; Table I),
-   K28A, Y33S, Q38R, S40K, Q44N, R68Y, R70S, D75R and I77V (first monomer), and S32T, Y33C, Q44A, R70S and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 13 which is located in the first intron (FIGS. 1 and 17; Table I),
-   S32D, Y33H, Q44K, R68E, R70S and I77R (first monomer), and S32D, Y33H, Q44D, R68N, R70S and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 14 which is located in the first intron (FIGS. 1 and 17; Table I),
-   N30R, S32D, R68S, R70K and D75N (first monomer), and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 16 which is located in the second exon (FIGS. 1 and 17; Table I),
-   K28R, Y33A, Q38Y, S40Q, R68Y, R70S, D75R and I77Q (first monomer), and Y33R, Q38A, S40Q, R70S and I77K (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 17 which is located in the second exon (FIGS. 1 and 17; Table I),
-   K28R, N30D, D75E and I77R (first monomer), and S32D, Q38C, Q44A, R70S, D75R and I77Y (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 18 which is located in the second exon (FIGS. 1 and 17; Table I),
-   Y33R, Q38A, S40Q, R70S, and D75N (first monomer), and R70S, D75Y and I77R (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 19 which is located in the second exon (FIGS. 1 and 17; Table I),
-   Y33R, Q38A, S40Q, Q44N, R68Y, R70S, D75R and I77V (first monomer), and N30R, S32D, Q44T, R68H, R70H and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 20 which is located in the second exon (FIGS. 1 and 17; Table I),
-   Y33P, S40Q, Q44K, R68Y, R70S, D75Q and I77N (first monomer), and S32A, Y33C, R68Y, R70S, D75R and I77Q (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 21 which is located in the second intron (FIGS. 1 and 17; Table I),
-   K28A, Y33S, Q38R, S40K, R68N, R70S, D75N and I77R (first monomer), and S32N, Y33G, Q44A, R68A, R70K and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 22 which is located in the second intron (FIGS. 1 and 17; Table I),
-   N30H, Y33S, Q44Y, R70S and I77V (first monomer), and Y33R, S40Q, Q44A, R70S and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 23 which is located in the second intron (FIGS. 1 and 17; Table I),
-   S32D, Y33H, R68T, R70N and D75N (first monomer), and N30R, S32D, Q44T, R68N, R70N and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 24 which is located in the second intron and in the antisense transcript (FIGS. 1 and 17; Table I),
-   S32R, Y33D, Q44A, R70S and D75N (first monomer), and Y33S, Q38R, S40H, R68H, R70H and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 25 which is located in the third exon or the antisense transcript (FIGS. 1 and 17; Table I),
-   Y33P, S40Q, Q44K, R68Y, R70S, D75Q and I77N (first monomer), and K28R, Y33A, Q38Y, S40Q, Q44T, R68H, R70H and D75N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 26 which is located in the third exon or the antisense transcript (FIGS. 1 and 17; Table I),
-   K28Q, Q38R, S40K, Q44A, R70S, D75E and I77R (first monomer), and N30R, S32D, Q44K, R68Y, R70S, D75Q and I77N (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 27 which is located in the third exon or the antisense transcript (FIGS. 1 and 17; Table I),
-   Y33T, Q38A, R68H, R70H and D75N (first monomer), and N30H, Y33S, R70S and I77K (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 28 which is located in the third exon or the antisense transcript (FIGS. 1 and 17; Table I),
-   Y33T, S40T, R68A, R70K and D75N (first monomer), and K28R, Y33A, Q38Y, S40Q, R70S and I77K (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 29 which is located in the third exon or the antisense transcript (FIGS. 1 and 17; Table I), and
-   S32D, Y33H, Q44N, R70S, D75R and I77D (first monomer), and S32D, Q38C, R70S and I77K (second monomer); this variant cleaves the ROSA26 target SEQ ID NO: 30 which is located in the third exon or the antisense transcript (FIGS. 1 and 17; Table I).
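The substitution codes used in the list above follow the usual convention of wild-type residue, position, new residue (for example N30H). As a minimal sketch, assuming only that convention and the two subdomain ranges stated in this document (positions 26 to 40 and 44 to 77), the substitutions of a monomer can be parsed and grouped by functional subdomain. The function names are illustrative, not part of the disclosure.

```python
import re

# Functional subdomains of the LAGLIDADG core domain, as given in the text.
SUBDOMAINS = {"26-40": range(26, 41), "44-77": range(44, 78)}
_CODE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code: str):
    """Split a code such as 'N30H' into (wild-type residue, position, new residue)."""
    match = _CODE.match(code.strip())
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def parse_monomer(text: str):
    """Parse a list written as in the text, e.g. 'N30H, Y33S and D75N'."""
    tokens = re.split(r",|\band\b", text)
    return [parse_substitution(t) for t in tokens if t.strip()]


def group_by_subdomain(substitutions):
    """Group substitutions by subdomain; anything outside both goes to 'other'."""
    groups = {name: [] for name in SUBDOMAINS}
    groups["other"] = []
    for wild_type, position, new in substitutions:
        code = f"{wild_type}{position}{new}"
        for name, span in SUBDOMAINS.items():
            if position in span:
                groups[name].append(code)
                break
        else:
            groups["other"].append(code)
    return groups


if __name__ == "__main__":
    # First monomer of the variant cleaving SEQ ID NO: 5.
    first = parse_monomer("N30H, Y33S, Q44E, R68C, R70S and D75N")
    print(group_by_subdomain(first))
```

Substitutions such as I24V or K158E, which lie outside both ranges, would be reported under `other`.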

Examples of said variants cleaving the ROSA26 DNA targets of Table I (nucleotide sequences SEQ ID NO: 5 to 14 and 16 to 30) include the variants having a first monomer of any of the amino acid sequences SEQ ID NO: 82 to 106 and a second monomer of any of the amino acid sequences SEQ ID NO: 107 to 116, 4, 117 to 130, respectively (FIG. 17).

In addition, the following variants are able to cleave the ROSA26 DNA target, named rosa1, which is located in the second exon (FIGS. 1 and 17; Table I):

-   the forty variants having a first monomer selected from the group consisting of: I24V, Q44Y, R70S and D75N; I24V, Q44Y, R68Y, R70S, D75Y and I77R; I24V, Q44Y, R70S, D75N and I77V; I24V, Q44Y, R68N, R70S and D75R; I24V, Q44Y, R68S, R70S and D75R; I24V, Q44Y, R70S and D75Q; I24V, Q44Y, R68Y, R70S, D75R and I77V; I24V, Q44Y, R70S, D75Y and I77T, and a second monomer selected from the group consisting of: K28E, Y33R, Q38R, S40R, Q44A, R68H, R70Q and D75N; K28E, Y33R, Q38R, S40R, Q44A, R70N and D75N; K28E, Y33R, Q38R, S40K, Q44A, R68H, R70Q and D75N; K28E, Y33R, Q38R, S40K, Q44V, R70A and D75N; K28E, Y33R, Q38R, S40K, Q44A, R70G and D75N; examples of these variants are presented in Table V (first monomer: m2, m6, m8, m12, m13, m14, m16 or m17 (SEQ ID NO: 39, 43, 45, 49, 50, 51, 53 and 54); second monomer: any of the SEQ ID NO: 60, 61, 63, 65 and 66),
-   the variant having a first monomer comprising I24V, Q44Y, R70S, D75Y and I77T and a second monomer comprising K28E, Y33R, Q38R, S40R, Q44A, R68S, R70Q and D75N; an example of this variant is presented in Table V (first monomer m17 (SEQ ID NO: 54) and second monomer SEQ ID NO: 62),
-   the ten variants having a first monomer selected from the group consisting of: I24V, Q44Y, R68N, R70S and D75R; I24V, Q44Y, R68S, R70S and D75R; I24V, Q44Y, R70S and D75Q; I24V, Q44Y, R68Y, R70S, D75R and I77V; I24V, Q44Y, R70S, D75Y and I77T, and a second monomer selected from the group consisting of: K28E, Y33R, Q38R, S40K, Q44A, R70S and D75N and K28E, Y33R, Q38R, S40K, Q44A, R68T, R70N and D75N; examples of these variants are presented in Table V (first monomer: m12, m13, m14, m16 or m17 (SEQ ID NO: 49, 50, 51, 53 and 54); second monomer: any of the SEQ ID NO: 64 and 67), and
-   the variants having a first monomer consisting of the sequence SEQ ID NO: 72 (MO_1; Tables VI and VII) or SEQ ID NO: 73 (MO_2; Tables VI and VII) and a second monomer consisting of any of the sequences SEQ ID NO: 74 to 77 (mO_1 to mO_4; Table VII); these eight variants have additional substitutions that increase the cleavage activity of the variants for the rosa1 target.
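The counts given in the list above (forty, ten and eight variants) follow from pairing every listed first monomer with every listed second monomer. A minimal sketch of that enumeration is shown below, using the monomer labels and SEQ ID numbers quoted in the text; the variable names are illustrative only.

```python
from itertools import product

# First-monomer labels and second-monomer SEQ ID numbers quoted in the text.
first_a = ["m2", "m6", "m8", "m12", "m13", "m14", "m16", "m17"]
second_a = [60, 61, 63, 65, 66]

first_b = ["m12", "m13", "m14", "m16", "m17"]
second_b = [64, 67]

first_c = ["MO_1", "MO_2"]
second_c = ["mO_1", "mO_2", "mO_3", "mO_4"]

# Each heterodimer is one (first monomer, second monomer) combination.
forty = list(product(first_a, second_a))
ten = list(product(first_b, second_b))
eight = list(product(first_c, second_c))
single = [("m17", 62)]  # the individually listed variant

if __name__ == "__main__":
    print(len(forty), len(ten), len(eight))
    print(len(forty) + len(single) + len(ten), "rosa1 heterodimers from the m-series")
```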

The invention encompasses I-CreI variants having at least 85% identity, preferably at least 90% identity, more preferably at least 95% (96%, 97%, 98%, 99%) identity with the sequences as defined above, said variant being able to cleave a DNA target from the mouse ROSA26 locus.
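As an illustration of the identity thresholds above, the following sketch computes percent identity for two sequences that are assumed to be already aligned and of equal length, counting identical positions over the aligned length. This is a simplifying assumption for illustration; it is not necessarily the identity calculation intended by this document, and a real comparison would first align the sequences with a dedicated tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b)
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """True when identity is at least `threshold` percent (85, 90 or 95 in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy peptides, NOT sequences from the sequence listing.
    print(percent_identity("AAAA", "AAAT"))
    print(meets_threshold("AAAA", "AAAT"))
```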

For example, the invention encompasses the I-CreI variants derived from MO_1 and mO_2 by insertion of a NLS, a tag or both, which are selected from the group consisting of the sequences SEQ ID NO: 140 to 145.

The heterodimeric variant is advantageously an obligate heterodimer variant having at least one pair of mutations affecting corresponding residues of the first and the second monomers which make an intermolecular interaction between the two I-CreI monomers, wherein the first mutation of said pair(s) is in the first monomer and the second mutation of said pair(s) is in the second monomer and said pair(s) of mutations prevent the formation of functional homodimers from each monomer and allow the formation of a functional heterodimer, able to cleave the genomic DNA target from the mouse ROSA26 locus.

To form an obligate heterodimer, the monomers have advantageously at least one of the following pairs of mutations, respectively for the first and the second monomer:

a) the substitution of the glutamic acid at position 8 with a basic amino acid, preferably an arginine (first monomer) and the substitution of the lysine at position 7 with an acidic amino acid, preferably a glutamic acid (second monomer); the first monomer may further comprise the substitution of at least one of the lysine residues at positions 7 and 96, by an arginine,

b) the substitution of the glutamic acid at position 61 with a basic amino acid, preferably an arginine (first monomer) and the substitution of the lysine at position 96 with an acidic amino acid, preferably a glutamic acid (second monomer); the first monomer may further comprise the substitution of at least one of the lysine residues at positions 7 and 96, by an arginine,

c) the substitution of the leucine at position 97 with an aromatic amino acid, preferably a phenylalanine (first monomer) and the substitution of the phenylalanine at position 54 with a small amino acid, preferably a glycine (second monomer); the first monomer may further comprise the substitution of the phenylalanine at position 54 by a tryptophan and the second monomer may further comprise the substitution of the leucine at position 58 or lysine at position 57, by a methionine, and

d) the substitution of the aspartic acid at position 137 with a basic amino acid, preferably an arginine (first monomer) and the substitution of the arginine at position 51 with an acidic amino acid, preferably a glutamic acid (second monomer).

For example, the first monomer may have the mutation D137R and the second monomer, the mutation R51D. The obligate heterodimer meganuclease advantageously comprises at least two pairs of mutations as defined in a), b), c) or d) above; one of the pairs of mutations is advantageously as defined in c) or d). Preferably, one monomer comprises the substitution of the lysine residues at positions 7 and 96 by an acidic amino acid (aspartic acid (D) or glutamic acid (E)), preferably a glutamic acid (K7E and K96E), and the other monomer comprises the substitution of the glutamic acid residues at positions 8 and 61 by a basic amino acid (arginine (R) or lysine (K); for example, E8K and E61R). More preferably, the obligate heterodimer meganuclease comprises three pairs of mutations as defined in a), b) and c) above. The obligate heterodimer meganuclease consists advantageously of a first monomer (A) having at least the mutations (i) E8R, E8K or E8H, E61R, E61K or E61H and L97F, L97W or L97Y; (ii) K7R, E8R, E61R, K96R and L97F, or (iii) K7R, E8R, F54W, E61R, K96R and L97F, and a second monomer (B) having at least the mutations (iv) K7E or K7D, F54G or F54A and K96D or K96E; (v) K7E, F54G, L58M and K96E, or (vi) K7E, F54G, K57M and K96E. For example, the first monomer may have the mutations K7R, E8R or E8K, E61R, K96R and L97F or K7R, E8R or E8K, F54W, E61R, K96R and L97F and the second monomer, the mutations K7E, F54G, L58M and K96E or K7E, F54G, K57M and K96E. An example of heterodimer is SEQ ID NO: 147 and SEQ ID NO: 148. The obligate heterodimer may comprise at least one NLS and/or one tag as defined above; said NLS and/or tag may be in the first and/or the second monomer.
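The pairing rules a) to d) can be encoded as a small lookup table and used to check which pairs a given two-monomer design satisfies. The sketch below is one possible encoding, under stated assumptions: the residue classes are inferred from the examples in the text (basic = R, K, H, since E8H and E61H are listed; acidic = D, E; aromatic = F, W, Y; small = G, A), and the function names are hypothetical.

```python
# Residue classes inferred from the mutations named in the text (an assumption).
BASIC, ACIDIC = set("RKH"), set("DE")
AROMATIC, SMALL = set("FWY"), set("GA")

# pair label -> ((position, wild type, allowed new residues) for the first monomer,
#                (position, wild type, allowed new residues) for the second monomer)
PAIRS = {
    "a": ((8, "E", BASIC), (7, "K", ACIDIC)),
    "b": ((61, "E", BASIC), (96, "K", ACIDIC)),
    "c": ((97, "L", AROMATIC), (54, "F", SMALL)),
    "d": ((137, "D", BASIC), (51, "R", ACIDIC)),
}


def has_mutation(mutations, position, wild_type, allowed):
    """True if any code like 'E8R' matches the position, wild type and residue class."""
    return any(
        m[0] == wild_type and int(m[1:-1]) == position and m[-1] in allowed
        for m in mutations
    )


def satisfied_pairs(first_monomer, second_monomer):
    """Return the labels of the pairs a) to d) present in the two monomers."""
    return [
        label
        for label, (first_rule, second_rule) in PAIRS.items()
        if has_mutation(first_monomer, *first_rule)
        and has_mutation(second_monomer, *second_rule)
    ]


if __name__ == "__main__":
    first = {"K7R", "E8R", "E61R", "K96R", "L97F"}
    second = {"K7E", "F54G", "L58M", "K96E"}
    print(satisfied_pairs(first, second))
```

With the example monomers above, the design carries the three pairs a), b) and c), which corresponds to the "more preferably" case described in the text.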

The subject-matter of the present invention is also a single-chain chimeric meganuclease (fusion protein) derived from an I-CreI variant as defined above. The single-chain meganuclease may comprise two I-CreI monomers, two I-CreI core domains (positions 6 to 94 of I-CreI) or a combination of both. Preferably, the two monomers/core domains or the combination of both, are connected by a peptidic linker. An example of peptidic linker is SEQ ID NO: 149. An example of single-chain chimeric meganuclease is SEQ ID NO: 146. The single-chain chimeric meganuclease may further comprise at least one NLS and/or one tag as defined above; said NLS and/or tag may be in the first and/or the second monomer.

The subject-matter of the present invention is also a polynucleotide fragment encoding a variant or a single-chain chimeric meganuclease as defined above; said polynucleotide may encode one monomer of a homodimeric or heterodimeric variant, or two domains/monomers of a single-chain chimeric meganuclease.

The subject-matter of the present invention is also a recombinant vector for the expression of a variant or a single-chain meganuclease according to the invention. The recombinant vector comprises at least one polynucleotide fragment encoding a variant or a single-chain meganuclease, as defined above. In a preferred embodiment, said vector comprises two different polynucleotide fragments, each encoding one of the monomers of a heterodimeric variant.

A vector which can be used in the present invention includes, but is not limited to, a viral vector, a plasmid, an RNA vector or a linear or circular DNA or RNA molecule which may consist of chromosomal, non-chromosomal, semi-synthetic or synthetic nucleic acids. Preferred vectors are those capable of autonomous replication (episomal vector) and/or expression of nucleic acids to which they are linked (expression vectors). Large numbers of suitable vectors are known to those of skill in the art and commercially available.

Viral vectors include retrovirus, adenovirus, parvovirus (e.g. adeno-associated viruses), coronavirus, negative strand RNA viruses such as orthomyxovirus (e.g., influenza virus), rhabdovirus (e.g., rabies and vesicular stomatitis virus), paramyxovirus (e.g. measles and Sendai), positive strand RNA viruses such as picornavirus and alphavirus, and double-stranded DNA viruses including adenovirus, herpesvirus (e.g., Herpes Simplex virus types 1 and 2, Epstein-Barr virus, cytomegalovirus), and poxvirus (e.g., vaccinia, fowlpox and canarypox). Other viruses include Norwalk virus, togavirus, flavivirus, reoviruses, papovavirus, hepadnavirus, and hepatitis virus, for example. Examples of retroviruses include: avian leukosis-sarcoma, mammalian C-type, B-type viruses, D type viruses, HTLV-BLV group, lentivirus, spumavirus (Coffin, J. M., Retroviridae: The viruses and their replication, In Fundamental Virology, Third Edition, B. N. Fields, et al., Eds., Lippincott-Raven Publishers, Philadelphia, 1996).

Preferred vectors include lentiviral vectors, and particularly self-inactivating lentiviral vectors.

Vectors can comprise selectable markers, for example: neomycin phosphotransferase, histidinol dehydrogenase, dihydrofolate reductase, hygromycin phosphotransferase, herpes simplex virus thymidine kinase, adenosine deaminase, glutamine synthetase, and hypoxanthine-guanine phosphoribosyl transferase for eukaryotic cell culture; TRP1 for S. cerevisiae; tetracycline, rifampicin or ampicillin resistance in E. coli.

Preferably said vectors are expression vectors, wherein the sequence(s) encoding the variant/single-chain meganuclease of the invention is placed under control of appropriate transcriptional and translational control elements to permit production or synthesis of said variant. Therefore, said polynucleotide is comprised in an expression cassette. More particularly, the vector comprises a replication origin, a promoter operatively linked to said encoding polynucleotide, a ribosome-binding site, an RNA-splicing site (when genomic DNA is used), a polyadenylation site and a transcription termination site. It also can comprise an enhancer. Selection of the promoter will depend upon the cell in which the polypeptide is expressed. Preferably, when said variant is a heterodimer, the two polynucleotides encoding each of the monomers are included in one vector which is able to drive the expression of both polynucleotides simultaneously. Suitable promoters include tissue specific and/or inducible promoters. Examples of inducible promoters are: eukaryotic metallothionein promoter which is induced by increased levels of heavy metals, prokaryotic lacZ promoter which is induced in response to isopropyl-β-D-thiogalacto-pyranoside (IPTG) and eukaryotic heat shock promoter which is induced by increased temperature. Examples of tissue specific promoters are skeletal muscle creatine kinase, prostate-specific antigen (PSA), α-antitrypsin protease, human surfactant (SP) A and B proteins, β-casein and acidic whey protein genes.

According to another advantageous embodiment of said vector, it includes a targeting construct comprising sequences sharing homologies with the region surrounding the genomic DNA cleavage site as defined above.

Alternatively, the vector coding for an I-CreI variant/single-chain meganuclease and the vector comprising the targeting construct are different vectors.

More preferably, the targeting DNA construct comprises:

a) sequences sharing homologies with the region surrounding the genomic DNA cleavage site as defined above, and

b) a sequence to be introduced flanked by sequences as in a).

For gene knock-in at the mouse ROSA26 locus, the sequence to be introduced comprises an exogenous gene expression cassette or part thereof and optionally a selection marker, such as the HPRT gene.

Alternatively, the sequence to be introduced can be any other sequence used to alter the mouse ROSA26 locus in some specific way including a sequence used to modify a specific sequence in the mouse ROSA26 locus, to attenuate or activate the mouse ROSA26 locus or part thereof, to introduce a mutation into a site of interest of the mouse ROSA26 locus, or to inactivate or delete the mouse ROSA26 locus or a part thereof.

Preferably, homologous sequences of at least 50 bp, preferably more than 100 bp and more preferably more than 200 bp are used for repairing the cleavage site. Indeed, shared DNA homologies are located in regions flanking upstream and downstream the site of the break and the DNA sequence to be introduced should be located between the two arms.

Therefore, the targeting construct is preferably from 200 bp to 6000 bp, more preferably from 1000 bp to 2000 bp.

For the insertion of a sequence, DNA homologies are generally located in regions directly upstream and downstream to the site of the break (sequences immediately adjacent to the break; minimal repair matrix). However, when the insertion is associated with a deletion of sequences flanking the cleavage site, shared DNA homologies are located in regions upstream and downstream the region of the deletion.
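The layout of the targeting construct described above (an upstream homology arm, the sequence to be introduced, and a downstream homology arm) and the stated size preferences can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the default limits are the figures quoted in the text (arms of at least 50 bp, construct of 200 bp to 6000 bp), and the sequences used are artificial placeholders rather than ROSA26 sequences.

```python
def build_targeting_construct(left_arm: str, insert: str, right_arm: str,
                              min_arm_bp: int = 50) -> str:
    """Assemble left arm + sequence to be introduced + right arm.

    Raises ValueError if either homology arm is shorter than `min_arm_bp`.
    """
    for name, arm in (("left", left_arm), ("right", right_arm)):
        if len(arm) < min_arm_bp:
            raise ValueError(f"{name} homology arm is {len(arm)} bp, below {min_arm_bp} bp")
    return left_arm + insert + right_arm


def in_preferred_size_range(construct: str, low_bp: int = 200, high_bp: int = 6000) -> bool:
    """True when the construct length falls within the preferred range."""
    return low_bp <= len(construct) <= high_bp


if __name__ == "__main__":
    # Placeholder sequences (NOT genomic sequences): 200 bp arms around an 800 bp insert.
    construct = build_targeting_construct("a" * 200, "c" * 800, "g" * 200)
    print(len(construct), in_preferred_size_range(construct))
    print(in_preferred_size_range(construct, 1000, 2000))  # narrower preferred range
```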

For example, the mouse ROSA26 DNA targets which are cleaved by the variants as defined above and the minimal matrix for repairing each of the cleavages generated by each variant, are indicated in FIG. 17.

The subject-matter of the present invention is also a composition characterized in that it comprises at least one meganuclease as defined above (variant or single-chain derived chimeric meganuclease) and/or at least one expression vector encoding said meganuclease, as defined above.

In a preferred embodiment of said composition, it comprises a targeting DNA construct as defined above.

Preferably, said targeting DNA construct is either included in a recombinant vector or it is included in an expression vector comprising the polynucleotide(s) encoding the meganuclease according to the invention.

The subject-matter of the present invention is further the use of a meganuclease as defined above, one or two polynucleotide(s), preferably included in expression vector(s), for genome engineering at the mouse ROSA26 locus, for non-therapeutic purposes.

According to an advantageous embodiment of said use, it is for inducing a double-strand break in a site of interest of the mouse ROSA26 locus comprising a genomic DNA target sequence, thereby inducing a DNA recombination event, a DNA loss or cell death.

According to the invention, said double-strand break is for: modifying a specific sequence in the ROSA26 locus, attenuating or activating the endogenous ROSA26 locus, introducing a mutation into a site of interest of the ROSA26 locus, introducing an exogenous gene or a part thereof, inactivating or deleting the endogenous ROSA26 locus or a part thereof, translocating a chromosomal arm, or leaving the DNA unrepaired and degraded.

According to another advantageous embodiment of said use, said variant, polynucleotide(s), vector, are associated with a targeting DNA construct as defined above.

In a preferred embodiment of the use of the meganuclease according to the present invention, it comprises at least the following steps: 1) introducing a double-strand break at a site of interest of the mouse ROSA26 locus comprising at least one recognition and cleavage site of said meganuclease, by contacting said cleavage site with said meganuclease; 2) providing a targeting DNA construct comprising the sequence to be introduced flanked by sequences sharing homologies to the targeted locus. Said meganuclease can be provided directly to the cell or through an expression vector comprising the polynucleotide sequence encoding said meganuclease and suitable for its expression in the used cell. This strategy is used to introduce a DNA sequence at the target site, for example to generate knock-in transgenic mice or recombinant mouse cell lines that can be used for protein production, gene function studies, drug development (drug screening) or as disease model.

The subject-matter of the present invention is also a method for making a transgenic mouse expressing a product of interest, comprising at least the steps of:

(a) introducing into a mouse pluripotent precursor cell or a mouse embryo, a meganuclease, as defined above, so as to induce a double stranded cleavage at a site of interest of the ROSA26 locus comprising a DNA recognition and cleavage site of said meganuclease; simultaneously or consecutively,

(b) introducing into the mouse precursor cell or embryo of step (a) a targeting DNA, comprising at least a sequence encoding a product of interest flanked by sequences sharing homologies to the region surrounding the cleavage site, so as to generate a genomically modified mouse precursor cell or embryo having inserted the sequence of interest by homologous recombination between the targeting DNA and the chromosomal DNA,

(c) developing the genomically modified mouse precursor cell or embryo of step (b) into a chimeric mouse, and

(d) deriving a transgenic mouse from the chimeric mouse of step (c).

Preferably, step (c) comprises the introduction of the genomically modified precursor cell generated in step (b) into blastocysts so as to generate chimeric mice.

According to a preferred embodiment of said method, it comprises a further step (e) of recovering the product of interest from the transgenic mouse, by any means.

The subject-matter of the present invention is also a method for making a recombinant mouse cell expressing a product of interest, comprising at least the steps of:

(a) introducing into a mouse cell, a meganuclease, as defined above, so as to induce a double stranded cleavage at a site of interest of the ROSA26 locus comprising a DNA recognition and cleavage site for said meganuclease; simultaneously or consecutively,

(b) introducing into the cell of step (a), a targeting DNA, wherein said targeting DNA comprises at least a sequence encoding a product of interest flanked by sequences sharing homologies to the region surrounding the cleavage site, so as to generate a recombinant mouse cell having inserted the sequence of interest by homologous recombination between the targeting DNA and the chromosomal DNA, and

(c) isolating the recombinant mouse cell of step (b), by any appropriate means.

According to a preferred embodiment of said method, it comprises a further step (d) of recovering the product of interest from the recombinant mouse cell, by any means.

The targeting DNA is introduced into the cell under conditions appropriate for introduction of the targeting DNA into the site of interest.

In a preferred embodiment, said targeting DNA construct is inserted in a vector.

The cell which is modified may be any cell of interest. For making transgenic mice, the cells are pluripotent precursor cells such as embryo-derived stem (ES) cells, which are well-known in the art. For making recombinant mouse cell lines, the cells may advantageously be NS0, SP2/0 (BALB/c myeloma; ECACC #85110503 and #85072401), or L (ATCC # CRL-2648) cells. Said meganuclease can be provided directly to the cell or through an expression vector comprising the polynucleotide sequence encoding said meganuclease and suitable for its expression in the used cell.

For making transgenic animals/recombinant cell lines expressing a product of interest, the targeting DNA comprises a sequence encoding the product of interest (protein or RNA), and optionally a selectable marker gene, flanked by sequences upstream and downstream of the meganuclease site in the mouse ROSA26 locus, as defined above, so as to generate genomically modified cells (animal precursor cell or embryo/animal or human cell) having integrated the exogenous sequence of interest at the meganuclease site in the ROSA26 locus, by homologous recombination.

The sequence of interest may be any gene coding for a certain protein/peptide of interest, including but not limited to: reporter genes, receptors, signaling molecules, transcription factors, pharmaceutically active proteins and peptides, disease causing gene products and toxins. The sequence may also encode an RNA molecule of interest including for example a siRNA.

The expression of the exogenous sequence may be driven, either by the endogenous ROSA26 promoter or by a heterologous promoter, preferably a ubiquitous or tissue specific promoter, either constitutive or inducible, as defined above. In addition, the expression of the sequence of interest may be conditional; the expression may be induced by a site-specific recombinase (Cre, FLP . . . ).

Thus, the sequence of interest is inserted in an appropriate cassette that may comprise a heterologous promoter operatively linked to said gene of interest and one or more functional sequences including but not limited to (selectable) marker genes, recombinase recognition sites, polyadenylation signals, splice acceptor sequences, introns, tags for protein detection and enhancers.

Alternatively, the appropriate cassette may comprise an Internal Ribosomal Entry site (IRES) operatively linked to said gene of interest and one or more functional sequences including but not limited to (selectable) marker genes, recombinase recognition sites, polyadenylation signals, splice acceptor sequences, introns, tags for protein detection and enhancers.

The meganuclease can be used either as a polypeptide or as a polynucleotide construct encoding said polypeptide. It is introduced into mouse cells, by any convenient means well-known to those in the art, which are appropriate for the particular cell type, alone or in association with either at least an appropriate vehicle or carrier and/or with the targeting DNA.

According to an advantageous embodiment of the uses according to the invention, the meganuclease (polypeptide) is associated with:

-   liposomes, polyethyleneimine (PEI); in such a case said association is administered and therefore introduced into somatic target cells,
-   membrane translocating peptides (Bonetta, The Scientist, 2002, 16, 38; Ford et al., Gene Ther., 2001, 8, 1-4; Wadia and Dowdy, Curr. Opin. Biotechnol., 2002, 13, 52-56); in such a case, the sequence of the variant/single-chain meganuclease is fused with the sequence of a membrane translocating peptide (fusion protein).

According to another advantageous embodiment of the uses according to the invention, the meganuclease (polynucleotide encoding said meganuclease) and/or the targeting DNA is inserted in a vector. Vectors comprising targeting DNA and/or nucleic acid encoding a meganuclease can be introduced into a cell by a variety of methods (e.g., injection, direct uptake, projectile bombardment, liposomes, electroporation). Meganucleases can be stably or transiently expressed in cells using expression vectors. Techniques of expression in eukaryotic cells are well known to those in the art. (See Current Protocols in Human Genetics: Chapter 12 “Vectors For Gene Therapy” & Chapter 13 “Delivery Systems for Gene Therapy”). Optionally, it may be preferable to incorporate a nuclear localization signal into the recombinant protein to be sure that it is expressed within the nucleus.

Once in a cell, the meganuclease and, if present, the vector comprising targeting DNA and/or nucleic acid encoding a meganuclease are imported or translocated by the cell from the cytoplasm to the site of action in the nucleus.

In one embodiment of the uses according to the present invention, the meganuclease is substantially non-immunogenic, i.e., engenders little or no adverse immunological response. A variety of methods for ameliorating or eliminating deleterious immunological reactions of this sort can be used in accordance with the invention. In a preferred embodiment, the meganuclease is substantially free of N-formyl methionine. Another way to avoid unwanted immunological reactions is to conjugate meganucleases to polyethylene glycol (“PEG”) or polypropylene glycol (“PPG”) (preferably of 500 to 20,000 daltons average molecular weight (MW)). Conjugation with PEG or PPG, as described by Davis et al. (U.S. Pat. No. 4,179,337) for example, can provide non-immunogenic, physiologically active, water soluble endonuclease conjugates with anti-viral activity. Similar methods also using a polyethylene-polypropylene glycol copolymer are described in Saifer et al. (U.S. Pat. No. 5,006,333).

The invention also concerns a prokaryotic or eukaryotic host cell which is modified by a polynucleotide or a vector as defined above, preferably an expression vector.

The invention also concerns a non-human transgenic animal or a transgenic plant, characterized in that all or part of their cells are modified by a polynucleotide or a vector as defined above.

As used herein, a cell refers to a prokaryotic cell, such as a bacterial cell, or an eukaryotic cell, such as an animal, plant or yeast cell.

The subject-matter of the present invention is also the use of at least one meganuclease variant, as defined above, as a scaffold for making other meganucleases. For example a third round of mutagenesis and selection/screening can be performed on said variants, for the purpose of making novel, third generation meganucleases.

The different uses of the meganuclease and the methods of using said meganuclease according to the present invention include the use of the I-CreI variant, the single-chain chimeric meganuclease derived from said variant, the polynucleotide(s), vector, cell, transgenic plant or non-human transgenic mammal encoding said variant or single-chain chimeric meganuclease, as defined above.

The I-CreI variant according to the invention may be obtained by a method for engineering I-CreI variants able to cleave a genomic DNA target sequence from the mouse ROSA26 locus, comprising at least the steps of:

(a) constructing a first series of I-CreI variants having at least one substitution in a first functional subdomain of the LAGLIDADG core domain situated from positions 26 to 40 of I-CreI,

(b) constructing a second series of I-CreI variants having at least one substitution in a second functional subdomain of the LAGLIDADG core domain situated from positions 44 to 77 of I-CreI,

(c) selecting and/or screening the variants from the first series of step (a) which are able to cleave a mutant I-CreI site wherein at least (i) the nucleotide triplet at positions −10 to −8 of the I-CreI site has been replaced with the nucleotide triplet which is present at positions −10 to −8 of said genomic target and (ii) the nucleotide triplet at positions +8 to +10 has been replaced with the reverse complementary sequence of the nucleotide triplet which is present at positions −10 to −8 of said genomic target,

(d) selecting and/or screening the variants from the second series of step (b) which are able to cleave a mutant I-CreI site wherein at least (i) the nucleotide triplet at positions −5 to −3 of the I-CreI site has been replaced with the nucleotide triplet which is present at positions −5 to −3 of said genomic target and (ii) the nucleotide triplet at positions +3 to +5 has been replaced with the reverse complementary sequence of the nucleotide triplet which is present at positions −5 to −3 of said genomic target,

(e) selecting and/or screening the variants from the first series of step (a) which are able to cleave a mutant I-CreI site wherein at least (i) the nucleotide triplet at positions +8 to +10 of the I-CreI site has been replaced with the nucleotide triplet which is present at positions +8 to +10 of said genomic target and (ii) the nucleotide triplet at positions −10 to −8 has been replaced with the reverse complementary sequence of the nucleotide triplet which is present at positions +8 to +10 of said genomic target,

(f) selecting and/or screening the variants from the second series of step (b) which are able to cleave a mutant I-CreI site wherein at least (i) the nucleotide triplet at positions +3 to +5 of the I-CreI site has been replaced with the nucleotide triplet which is present at positions +3 to +5 of said genomic target and (ii) the nucleotide triplet at positions −5 to −3 has been replaced with the reverse complementary sequence of the nucleotide triplet which is present at positions +3 to +5 of said genomic target,

(g) combining in a single variant, the mutation(s) at positions 26 to 40 and 44 to 77 of two variants from step (c) and step (d), to obtain a novel homodimeric I-CreI variant which cleaves a sequence wherein (i) the nucleotide triplet at positions −10 to −8 is identical to the nucleotide triplet which is present at positions −10 to −8 of said genomic target, (ii) the nucleotide triplet at positions +8 to +10 is identical to the reverse complementary sequence of the nucleotide triplet which is present at positions −10 to −8 of said genomic target, (iii) the nucleotide triplet at positions −5 to −3 is identical to the nucleotide triplet which is present at positions −5 to −3 of said genomic target and (iv) the nucleotide triplet at positions +3 to +5 is identical to the reverse complementary sequence of the nucleotide triplet which is present at positions −5 to −3 of said genomic target, and/or

(h) combining in a single variant, the mutation(s) at positions 26 to 40 and 44 to 77 of two variants from step (e) and step (f), to obtain a novel homodimeric I-CreI variant which cleaves a sequence wherein (i) the nucleotide triplet at positions +3 to +5 is identical to the nucleotide triplet which is present at positions +3 to +5 of said genomic target, (ii) the nucleotide triplet at positions −5 to −3 is identical to the reverse complementary sequence of the nucleotide triplet which is present at positions +3 to +5 of said genomic target, (iii) the nucleotide triplet at positions +8 to +10 is identical to the nucleotide triplet which is present at positions +8 to +10 of said genomic target and (iv) the nucleotide triplet at positions −10 to −8 is identical to the reverse complementary sequence of the nucleotide triplet which is present at positions +8 to +10 of said genomic target,

(i) combining the variants obtained in steps (g) and (h) to form heterodimers, and

(j) selecting and/or screening the heterodimers from step (i) which are able to cleave said genomic DNA target from the mouse ROSA26 locus.

One of the step(s) (c), (d), (e) or (f) may be omitted. For example, if step (c) is omitted, step (d) is performed with a mutant I-CreI site wherein both nucleotide triplets at positions −10 to −8 and −5 to −3 have been replaced with the nucleotide triplets which are present at positions −10 to −8 and −5 to −3, respectively of said genomic target, and the nucleotide triplets at positions +3 to +5 and +8 to +10 have been replaced with the reverse complementary sequence of the nucleotide triplets which are present at positions −5 to −3 and −10 to −8, respectively of said genomic target.
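By way of illustration only, the construction of the palindromic mutant I-CreI sites screened in steps (c) to (f) may be sketched in Python as follows. The 11-nucleotide half of the palindromic I-CreI site (caaaacgtcgt) and the 22 bp rosa1.2 sequence are assumptions reconstructed from the targets quoted in the examples (5GAT_P, rosa1.3 and rosa1.4), and all function names are hypothetical; this is not presented as the procedure actually used.

```python
# Illustrative sketch only; all names are hypothetical.
# ASSUMPTION: C1221_HALF and ROSA1_2 are reconstructed from the targets quoted
# in the examples (5GAT_P, rosa1.3, rosa1.4), not copied from a sequence listing.
COMPLEMENT = str.maketrans("acgt", "tgca")
C1221_HALF = "caaaacgtcgt"          # positions -11 to -1 of the palindromic I-CreI site
ROSA1_2 = "caacatgatgtacataatccca"  # 22 bp target, positions -11 to +11


def revcomp(seq: str) -> str:
    """Reverse complementary sequence of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def index_of(position: int) -> int:
    """Map a site position (-11..-1, +1..+11; no position 0) to a string index."""
    if position == 0 or abs(position) > 11:
        raise ValueError(f"invalid position: {position}")
    return position + 11 if position < 0 else position + 10


def screening_half(target: str, start: int, scaffold_half: str = C1221_HALF) -> str:
    """Half-site of the palindromic mutant I-CreI site for one target triplet.

    start is the lowest position of the triplet in the target:
    -10 or -5 for steps (c)/(d), +8 or +3 for steps (e)/(f).
    """
    i = index_of(start)
    triplet = target[i:i + 3]
    if start > 0:
        # A right-hand triplet appears on the left half as its reverse complement.
        triplet = revcomp(triplet)
        start = -(start + 2)
    j = index_of(start)
    return scaffold_half[:j] + triplet + scaffold_half[j + 3:]


def palindrome(half: str) -> str:
    """Full 22 bp palindromic site built from its first 11 nucleotides."""
    return half + revcomp(half)


sites = {step: screening_half(ROSA1_2, start)
         for step, start in (("c", -10), ("d", -5), ("e", 8), ("f", 3))}
for step, half in sites.items():
    print(step, half + "_P", palindrome(half))
```

Under these assumptions, the half-site obtained for step (d) is caaaacgatgt, i.e. the 5GAT_P target (SEQ ID NO: 32) referred to in example 2.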

Steps (a), (b), (g), (h) and (i) may further comprise the introduction of additional mutations at other positions contacting the DNA target sequence or interacting directly or indirectly with said DNA target, at positions which improve the binding and/or cleavage properties of the mutants, or at positions which prevent the formation of functional homodimers, as defined above. This may be performed by generating a combinatorial library as described in the International PCT Application WO 2004/067736.

The method for engineering I-CreI variants of the invention advantageously comprises the introduction of random mutations on the whole variant or in a part of the variant, in particular the C-terminal half of the variant (positions 80 to 163) to improve the binding and/or cleavage properties of the mutants towards the DNA target from the gene of interest. The mutagenesis may be performed by generating random mutagenesis libraries on a pool of variants, according to standard mutagenesis methods which are well-known in the art and commercially available. Preferably, the mutagenesis is performed on the entire sequence of one monomer of the heterodimer formed in step (i) or obtained in step (j), advantageously on a pool of monomers, preferably on both monomers of the heterodimer of step (i) or (j).

Preferably, two rounds of selection/screening are performed according to the process illustrated by FIG. 4 of Arnould et al., J. Mol. Biol., Epub 10 May 2007. In the first round, one of the monomers of the heterodimer is mutagenised (monomer Y in FIG. 4), co-expressed with the other monomer (monomer X in FIG. 4) to form heterodimers, and the improved monomers Y⁺ are selected against the target from the gene of interest. In the second round, the other monomer (monomer X) is mutagenised, co-expressed with the improved monomers Y⁺ to form heterodimers, and selected against the target from the gene of interest to obtain meganucleases (X⁺ Y⁺) with improved activity.

The (intramolecular) combination of mutations in steps (g) and (h) may be performed by amplifying overlapping fragments comprising each of the two subdomains, according to well-known overlapping PCR techniques.

The (intermolecular) combination of the variants in step (i) is performed by co-expressing one variant from step (g) with one variant from step (h), so as to allow the formation of heterodimers. For example, host cells may be modified by one or two recombinant expression vector(s) encoding said variant(s). The cells are then cultured under conditions allowing the expression of the variant(s), so that heterodimers are formed in the host cells, as described previously in the International PCT Application WO 2006/097854 and Arnould et al., J. Mol. Biol., 2006, 355, 443-458.

The selection and/or screening in steps (c), (d), (e), (f) and/or (j) may be performed by using a cleavage assay in vitro or in vivo, as described in the International PCT Application WO 2004/067736, Arnould et al., J. Mol. Biol., 2006, 355, 443-458, Epinat et al., Nucleic Acids Res., 2003, 31, 2952-2962 and Chames et al., Nucleic Acids Res., 2005, 33, e178.

According to another advantageous embodiment of said method, steps (c), (d), (e), (f) and/or (j) are performed in vivo, under conditions where the double-strand break in the mutated DNA target sequence which is generated by said variant leads to the activation of a positive selection marker or a reporter gene, or the inactivation of a negative selection marker or a reporter gene, by recombination-mediated repair of said DNA double-strand break.

The subject matter of the present invention is also an I-CreI variant having mutations at positions 26 to 40 and/or 44 to 77 of I-CreI that is useful for engineering the variants able to cleave a DNA target from the mouse ROSA26 locus, according to the present invention. In particular, the invention encompasses the I-CreI variants as defined in step (c) to (f) of the method for engineering I-CreI variants, as defined above, including the variants m1 to m18 (Table II, SEQ ID NO: 38 to 55), the variant comprising Q44V, R70A and D75N (SEQ ID NO: 131; Table III) and the variant comprising K28E, Y33R, Q38R, S40R and D75N (SEQ ID NO: 132; Table III). The invention encompasses also the I-CreI variants as defined in step (g) and (h) of the method for engineering I-CreI variants, as defined above, including the variants of the sequence SEQ ID NO: 60 to 67 (combined variants of Table III).

Single-chain chimeric meganucleases able to cleave a DNA target from the gene of interest are derived from the variants according to the invention by methods well-known in the art (Epinat et al., Nucleic Acids Res., 2003, 31, 2952-62; Chevalier et al., Mol. Cell., 2002, 10, 895-905; Steuer et al., Chembiochem., 2004, 5, 206-13; International PCT Applications WO 03/078619 and WO 2004/031346). Any of such methods may be applied for constructing single-chain chimeric meganucleases derived from the variants as defined in the present invention.

The polynucleotide sequence(s) encoding the variant as defined in the present invention may be prepared by any method known by the man skilled in the art. For example, they are amplified from a cDNA template, by polymerase chain reaction with specific primers. Preferably the codons of said cDNA are chosen to favour the expression of said protein in the desired expression system.

The recombinant vector comprising said polynucleotides may be obtained and introduced in a host cell by the well-known recombinant DNA and genetic engineering techniques.

The I-CreI variant or single-chain derivative as defined in the present invention is produced by expressing the polypeptide(s) as defined above; preferably said polypeptide(s) are expressed or co-expressed (in the case of the variant only) in a host cell or a transgenic animal/plant modified by one expression vector or two expression vectors (in the case of the variant only), under conditions suitable for the expression or co-expression of the polypeptide(s), and the variant or single-chain derivative is recovered from the host cell culture or from the transgenic animal/plant.

The practice of the present invention will employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature. See, for example, Current Protocols in Molecular Biology (Frederick M. AUSUBEL, 2000, Wiley and son Inc, Library of Congress, USA); Molecular Cloning: A Laboratory Manual, Third Edition, (Sambrook et al, 2001, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et al. U.S. Pat. No. 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds. 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the series, Methods In ENZYMOLOGY (J. Abelson and M. Simon, eds.-in-chief, Academic Press, Inc., New York), specifically, Vols. 154 and 155 (Wu et al. eds.) and Vol. 185, “Gene Expression Technology” (D. Goeddel, ed.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); Handbook Of Experimental Immunology, Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); and Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986).

In addition to the preceding features, the invention further comprises other features which will emerge from the description which follows, which refers to examples illustrating the I-CreI meganuclease variants and their uses according to the invention, as well as to the appended drawings in which:

FIG. 1 represents the mouse ROSA26 locus (accession number EMBL CQ880114; SEQ ID NO: 3). The exons are boxed (Exon 1: positions 2490 to 2599; Exon 2 from transcript 1: positions 8228 to 9248; Exon 3 from transcript 2 starts at position 11845, and largely overlaps with the antisense transcript, which ends at position 11505). The three transcripts identified so far are indicated as well as the sequence and position of target rosa1 (SEQ ID NO: 15).

FIG. 2 represents the tridimensional structure of the I-CreI homing endonuclease bound to its DNA target. The catalytic core is surrounded by two αββαββα folds forming a saddle-shaped interaction interface above the DNA major groove.

FIG. 3 illustrates a two-step approach to engineer the specificity of I-CreI and other LAGLIDADG homing endonucleases. A large collection of I-CreI derivatives is generated by semi-rational mutagenesis of an initial scaffold and screening for functional variants with locally altered specificity. Then, a combinatorial approach is used to assemble these mutants into meganucleases with fully redesigned specificity. Homodimeric proteins (“half-meganucleases”) are created by combinations of two sets of mutations within a same αββαββα fold, and the co-expression of two such “half-meganucleases” can result in a heterodimeric species (“custom-meganuclease”) cleaving the target of interest.

FIG. 4 represents a strategy for the use of a meganuclease cleaving the mouse ROSA26 locus. Gene insertion using meganuclease-induced homologous recombination will knock-in a gene of interest in the mouse ROSA26 locus. Intron and exon sequences can be used as homologous regions.

FIG. 5 represents the rosa1 target sequence and derivatives. 10GGG_P, 5GAT_P and 5TAT_P are close derivatives found to be cleaved by previously obtained I-CreI mutants. They differ from C1221 (palindromic sequence cleaved by the I-CreI scaffold protein) by the boxed motifs. C1221, 10GGG_P, 5GAT_P and 5TAT_P were first described as 24 bp sequences, but structural data suggest that only 22 bp are relevant for protein/DNA interaction. However, positions ±12 are indicated in parentheses. rosa1 is the DNA sequence located in the mouse ROSA26 locus at position 8304. In the rosa1.2 target, the GTTC sequence in the middle of the target is replaced with GTAC, the bases found in C1221. rosa1.3 is the palindromic sequence derived from the left part of rosa1.2, and rosa1.4 is the palindromic sequence derived from the right part of rosa1.2. As shown in the figure, the boxed motifs from 10GGG_P, 5GAT_P and 5TAT_P are found in the rosa1 series of targets.

FIG. 6 represents the pCLS1055 vector map.

FIG. 7 represents the pCLS0542 vector map.

FIG. 8 illustrates the cleavage of the rosa1.3 DNA target by I-CreI mutants. The 63 positives found in primary screen were rearranged in one 96-well plate and validated by a secondary screen (in a quadruplicate format). The 22 mutants chosen in example 2 are circled.

FIG. 9 illustrates the cleavage of the rosa1.4 target by I-CreI combinatorial mutants. The 69 positives found in primary screen were rearranged in one 96-well plate and validated by a secondary screen (in a quadruplicate format). The 15 chosen mutants in example 3 are circled.

FIG. 10 represents the pCLS1107 vector map.

FIG. 11 illustrates the cleavage of the rosa1.2 and rosa1 targets by heterodimeric I-CreI combinatorial mutants. A. Example of screening of combinations of I-CreI mutants with the rosa1.2 target. B. Screening of the same combinations of I-CreI mutants with the rosa1 target. B5, B6, D5, D6, F5, F6, H5 and H6: yeast strains expressing rosa1.3 cutting I-CreI mutants transformed with pCLS1107 empty plasmid DNA.

FIG. 12 illustrates the cleavage of the rosa1 target. A series of I-CreI mutants cutting rosa1.4 were randomly mutagenized and co-expressed with a mutant cutting rosa1.3. Cleavage is tested with the rosa1 target. In each four-dot cluster, the two dots on the right correspond to one of the original heterodimers cleaving rosa1 in duplicate, whereas the two left dots correspond to a same mutated rosa1.4 cleaver co-expressed with a non mutated rosa1.3 cleaver (mutant m13, described in Tables IV and V). The two optimized mutants displaying improved cleavage of rosa1 are circled, and correspond to co-expression of mutants m13 and MO_1 (C10) or of m13 and MO_2 (E2). MO_1 and MO_2 are further described in Table VI.

FIG. 13 illustrates the cleavage of the rosa1 target. A series of I-CreI mutants cutting rosa1.3 were randomly mutagenized and co-expressed with a refined mutant cutting rosa1.4. Cleavage is tested with the rosa1 target. Mutants displaying efficient cleavage of rosa1 are circled. In the filter:

B11 corresponds to the heterodimer S19, V24, Y44, R68, S70, N75, V77 + E28, R33, R38, K40, A44, H68, Q70, A105, R107, A151, G153, E158;

C9 corresponds to the heterodimer S19, V24, Y44, R68, S70, Q75, I77 + E28, R33, R38, K40, A44, H68, Q70, A105, R107, A151, G153, E158;

C11 and E8 correspond to the heterodimer V24, Y44, S68, S70, R75, I77, A105 + E28, R33, R38, K40, A44, H68, Q70, A105, R107, A151, G153, E158; and

E6 corresponds to the heterodimer V24, Y44, S68, S70, R75, I77, G79 + E28, R33, R38, K40, A44, H68, Q70, A105, R107, A151, G153, E158.

H10 is a negative control, H11 and H12 are positive controls of different intensity. To compare the activity of the heterodimers against the rosa1 target before and after the improvement of mutants cutting the rosa1.3 target: in each cluster, the two right points are one of the heterodimers described in example 5 and the two left points are heterodimers with mutants described in example 6.

FIG. 14 represents the pCLS1058 vector map.

FIG. 15 represents the pCLS1069 vector map.

FIG. 16 illustrates the cleavage of the rosa1 target by I-CreI refined mutants in an extrachromosomal model in CHO cells. Values from two transfection experiments are shown. Cleavage of I-CreI and I-SceI targets by I-CreI N75 and I-SceI in the same experiments is shown as positive controls.

FIG. 17 represents meganuclease target sequences found in the mouse ROSA26 locus and the corresponding I-CreI variant which is able to cleave each of said DNA targets. The sequence of the DNA target is presented (column 1), with its position (column 2). The minimum repair matrix for repairing the cleavage at the target site is indicated by its first nucleotide (start, column 5) and last nucleotide (end, column 6). The sequence of each variant is defined by the residues at the indicated positions. For example, the first heterodimeric variant of FIG. 17 consists of a first monomer having K, H, S, S, Q, S, E, C, S, N and I at positions 28, 30, 32, 33, 38, 40, 44, 68, 70, 75 and 77, respectively and a second monomer having K, D, S, R, T, S, K, E, S, D and R at positions 28, 30, 32, 33, 38, 40, 44, 68, 70, 75 and 77, respectively. The positions are indicated by reference to I-CreI sequence SWISSPROT P05725 (SEQ ID NO: 1); I-CreI has K, N, S, Y, Q, S, Q, R, R, D and I, at positions 28, 30, 32, 33, 38, 40, 44, 68, 70, 75 and 77 respectively.

FIG. 18 represents the pCLS1675 vector map.

FIG. 19 represents the pCLS1761 vector map.

FIG. 20 represents the pCLS1762 vector map.

FIG. 21 illustrates PCR analysis of knock-in (KI) events with the IRES-Hygro matrix (pCLS1675). Clones wild-type for the ROSA26 locus and clones having a random insertion of the hygromycin CDS are negative in PCR. Clones having a KI event at the ROSA26 locus are positive in PCR. Clones having both a KI event and a random insertion are also positive in PCR.
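As a purely illustrative aid to reading the residue notation used for FIG. 17, the following Python sketch converts the residues listed at positions 28 to 77 into substitutions relative to wild-type I-CreI. The wild-type and monomer residues are those quoted in the FIG. 17 legend; the function name and the "N30H"-style output format are assumptions made for this example.

```python
# Illustrative sketch only; the helper name is hypothetical.
POSITIONS = (28, 30, 32, 33, 38, 40, 44, 68, 70, 75, 77)
WILD_TYPE = "KNSYQSQRRDI"  # I-CreI residues at POSITIONS, as quoted for FIG. 17


def substitutions(residues: str) -> list[str]:
    """List substitutions (wild-type residue, position, new residue) of a monomer."""
    if len(residues) != len(POSITIONS):
        raise ValueError("expected one residue per listed position")
    return [f"{wt}{pos}{new}"
            for pos, wt, new in zip(POSITIONS, WILD_TYPE, residues)
            if wt != new]


first_monomer = substitutions("KHSSQSECSNI")
second_monomer = substitutions("KDSRTSKESDR")
print(", ".join(first_monomer))
print(", ".join(second_monomer))
```

With the residues quoted above, the first monomer differs from I-CreI at six of the eleven listed positions and the second monomer at seven.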

EXAMPLE 1 Strategy for Engineering Novel Meganucleases Cleaving the Mouse ROSA26 Locus

The combinatorial approach described in Smith et al., Nucleic Acids Res., 2006 and illustrated in FIG. 3, was used to engineer the DNA binding domain of I-CreI, and cleave a 22 bp (non-palindromic) sequence named rosa1 (FIG. 5) and located at position 8304, in exon 2 of the mouse ROSA26 locus (accession number CQ880114; SEQ ID NO: 3). Meganucleases cleaving the rosa1 sequence could be used to knock-in genes in the mouse ROSA26 locus (FIG. 4). Applications are in the following fields: production of recombinant proteins in mouse cells, engineering of recombinant cell lines, for example for drug screening purpose, and engineering of transgenic mice, for example for use as animal models.

The rosa1 sequence is partly a patchwork of the 10GGG_P, 5GAT_P and 5TAT_P targets (FIG. 5), which are cleaved by previously identified meganucleases, obtained as described in International PCT Applications WO 2006/097784, WO 2006/097853 and WO 2007/049156; Arnould et al., J. Mol. Biol., 2006, 355, 443-458 and Smith et al., Nucleic Acids Res., Epub 27 Nov. 2006. Thus rosa1 could be cleaved by meganucleases combining the mutations found in the I-CreI derivatives cleaving these three targets.

The 10GGG_P, 5GAT_P and 5TAT_P sequences are 24 bp derivatives of C1221, a palindromic sequence cleaved by I-CreI (International PCT Applications WO 2006/097784, WO 2006/097853 and WO 2007/049156; Arnould et al., J. Mol. Biol., 2006, 355, 443-458 and Smith et al., Nucleic Acids Res., Epub 27 Nov. 2006). However, the structure of I-CreI bound to its DNA target suggests that the two external base pairs of these targets (positions −12 and 12) have no impact on binding and cleavage (Chevalier et al., Nat. Struct. Biol., 2001, 8, 312-316; Chevalier B. S. and Stoddard B. L., Nucleic Acids Res., 2001, 29, 3757-3754; Chevalier et al., J. Mol. Biol., 2003, 329, 253-269), and in this study, only positions −11 to 11 were considered. Consequently, the rosa1 series of targets were defined as 22 bp sequences instead of 24 bp.

Rosa1 differs from C1221 in one base pair of the 4 bp central region. According to the structure of the I-CreI protein bound to its target, there is no contact between the 4 central base pairs (positions −2 to 2) and the I-CreI protein (Chevalier et al., Nat. Struct. Biol., 2001, 8, 312-316; Chevalier B. S. and Stoddard B. L., Nucleic Acids Res., 2001, 29, 3757-3754; Chevalier et al., J. Mol. Biol., 2003, 329, 253-269). Thus, the bases at these positions are not expected to impact the binding efficiency. However, they could affect cleavage, which results from two nicks at the edge of this region. Thus, the GTTC sequence at positions −2 to 2 was first substituted with the GTAC sequence from C1221, resulting in target rosa1.2 (FIG. 5).

Then, two palindromic targets, rosa1.3 and rosa1.4, were derived from rosa1.2 (FIG. 5). Since rosa1.3 and rosa1.4 are palindromic, they should be cleaved by homodimeric proteins.
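As a non-limiting sketch, the derivation of rosa1.2, rosa1.3 and rosa1.4 from rosa1 may be written in Python as follows. The 22 bp rosa1 sequence used here is an assumption, reconstructed from the rosa1.3 and rosa1.4 half-sites quoted in examples 2 and 3 (caacatgatgt_P and tgggattatgt_P) and from the GTTC central motif; the function names are hypothetical.

```python
# Illustrative sketch only; all names are hypothetical.
# ASSUMPTION: ROSA1 is reconstructed from the rosa1.3/rosa1.4 half-sites and
# the GTTC central motif described in the text, not read from SEQ ID NO: 15.
COMPLEMENT = str.maketrans("acgt", "tgca")
ROSA1 = "caacatgatgttcataatccca"  # positions -11 to +11


def revcomp(seq: str) -> str:
    """Reverse complementary sequence of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def replace_center(site: str, center: str = "gtac") -> str:
    """Replace the four central bases (positions -2 to +2) of a 22 bp site."""
    return site[:9] + center + site[13:]


def left_palindrome(site: str) -> str:
    """Palindromic target derived from the left part (positions -11 to -1)."""
    left = site[:11]
    return left + revcomp(left)


def right_palindrome(site: str) -> str:
    """Palindromic target derived from the right part (positions +1 to +11)."""
    right = site[11:]
    return revcomp(right) + right


rosa1_2 = replace_center(ROSA1)
rosa1_3 = left_palindrome(rosa1_2)
rosa1_4 = right_palindrome(rosa1_2)
print("rosa1.2", rosa1_2)
print("rosa1.3", rosa1_3[:11] + "_P")
print("rosa1.4", rosa1_4[:11] + "_P")
```

Because each derived target equals its own reverse complement, both half-sites of rosa1.3 (or of rosa1.4) are identical, which is why a homodimeric protein is expected to cleave it.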

Thus proteins able to cleave the rosa1.3 and rosa1.4 sequences as homodimers were first designed (examples 2 and 3), and then coexpressed to obtain heterodimers cleaving rosa1 (example 4). Heterodimers cleaving the rosa1.2 and rosa1 targets could be identified. In order to improve cleavage activity for the rosa1 target, a series of chosen mutants cleaving rosa1.3 and rosa1.4 was then refined; the chosen mutants were randomly mutagenized, and used to form novel heterodimers that were screened against the rosa1 target (examples 5 and 6). Finally, heterodimers cleaving the rosa1 target could be identified, displaying a high cleavage activity in yeast and CHO cells.

EXAMPLE 2 Making of Meganucleases Cleaving Rosa1.3

This example shows that I-CreI mutants can cut the rosa1.3 DNA target sequence derived from the left part of the rosa1 target in a palindromic form (FIG. 5).

Target sequences described in this example are 22 bp palindromic sequences. Therefore, they will be described only by the first 11 nucleotides, followed by the suffix _P. For example, target rosa1.3 will also be noted caacatgatgt_P (SEQ ID NO: 35).
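For illustration, the _P shorthand can be expanded back to the full 22 bp palindromic sequence with a few lines of Python. The function name is hypothetical; the oligonucleotide used for the check is the one quoted in the Material and Methods of this example (SEQ ID NO: 37).

```python
# Illustrative sketch only; the helper name is hypothetical.
COMPLEMENT = str.maketrans("acgt", "tgca")


def expand_palindrome(name: str) -> str:
    """Expand an 11-nucleotide '<half>_P' name into the 22 bp palindromic target."""
    if not name.endswith("_P"):
        raise ValueError("expected a name ending in _P")
    half = name[:-2].lower()
    if len(half) != 11 or set(half) - set("acgt"):
        raise ValueError("expected 11 nucleotides before the _P suffix")
    # The second half is the reverse complement of the first half.
    return half + half.translate(COMPLEMENT)[::-1]


rosa1_3 = expand_palindrome("caacatgatgt_P")
oligo = "tggcatacaagtttcaacatgatgtacatcatgttgacaatcgtctgtca"  # SEQ ID NO: 37
print(rosa1_3, rosa1_3 in oligo)
```

The expanded rosa1.3 sequence is found in the middle of the cloning oligonucleotide, flanked on each side by the cloning sequences.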

The rosa1.3 target is similar to 5GAT_P at positions ±1, ±2, ±3, ±4, ±5, ±7, ±9, ±10 and ±11, the two sequences differing only at positions ±6 and ±8. It was hypothesized that positions ±6 would have little effect on the binding and cleavage activity. Mutants able to cleave 5GAT_P (caaaacgatgt_P; SEQ ID NO: 32) were previously obtained by mutagenesis on I-CreI N75 at positions 24, 44, 68, 70, 75 and 77, as described in Arnould et al., J. Mol. Biol., 2006, 355, 443-458 and International PCT Applications WO 2006/097784 and WO 2006/097853. In this example, it was checked whether mutants cleaving the 5GAT_P target could also cleave the rosa1.3 target.
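The comparison between the two palindromic targets can be reproduced with the following short Python sketch, given only as an illustration; the function name is hypothetical and the two half-sites are those quoted above.

```python
# Illustrative sketch only; the helper name is hypothetical.
def differing_positions(half_a: str, half_b: str) -> list[int]:
    """Absolute positions (1 to 11) at which two 11-nucleotide half-sites differ.

    The first nucleotide of a half-site is position -11 and the last is
    position -1; by symmetry each difference also applies to the + position.
    """
    if len(half_a) != 11 or len(half_b) != 11:
        raise ValueError("expected two 11-nucleotide half-sites")
    return sorted(11 - i for i, (a, b) in enumerate(zip(half_a, half_b)) if a != b)


diff = differing_positions("caacatgatgt", "caaaacgatgt")  # rosa1.3 vs 5GAT_P
print(["±%d" % p for p in diff])
```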

1) Material and Methods

The method for producing meganuclease variants and the assays based on cleavage-induced recombination in mammal or yeast cells, which are used for screening variants with altered specificity, are described in the International PCT Application WO 2004/067736; Epinat et al., Nucleic Acids Res., 2003, 31, 2952-2962; Chames et al., Nucleic Acids Res., 2005, 33, e178, and Arnould et al., J. Mol. Biol., 2006, 355, 443-458. These assays result in a functional LacZ reporter gene which can be monitored by standard methods.

a) Construction of Target Vector

The target was cloned as follows: an oligonucleotide corresponding to the target sequence flanked by gateway cloning sequence was ordered from PROLIGO: 5′ tggcatacaagtttcaacatgatgtacatcatgttgacaatcgtctgtca 3′ (SEQ ID NO: 37). Double-stranded target DNA, generated by PCR amplification of the single stranded oligonucleotide, was cloned using the Gateway protocol (INVITROGEN) into yeast reporter vector (pCLS1055, FIG. 6). Yeast reporter vector was transformed into S. cerevisiae strain FYBL2-7B (MAT α, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202).

b) I-CreI Mutants

I-CreI mutants cleaving 5GAT_P were identified in a library where positions 24, 44, 68, 70, 75 and 77 of I-CreI are mutated, as described previously in Arnould et al., J. Mol. Biol., 2006, 355, 443-458 and International PCT Applications WO 2006/097784 and WO 2006/097853. They are cloned in the DNA vector (pCLS0542, FIG. 7) and expressed in the yeast Saccharomyces cerevisiae strain FYC2-6A (MATα, trp1Δ63, leu2Δ1, his3Δ200).

c) Mating of Meganuclease Expressing Clones and Screening in Yeast:

Screening was performed as described previously (Arnould et al., J. Mol. Biol., 2006, 355, 443-458). Mating was performed using a colony gridder (QpixII, Genetix). Mutants were gridded on nylon filters covering YPD plates, using a low gridding density (about 4 spots/cm²). A second gridding process was performed on the same filters to spot a second layer consisting of different reporter-harboring yeast strains for each target. Membranes were placed on solid agar YPD rich medium, and incubated at 30° C. for one night, to allow mating. Next, filters were transferred to synthetic medium, lacking leucine and tryptophan, with galactose (2%) as a carbon source, and incubated for five days at 37° C., to select for diploids carrying the expression and target vectors. After 5 days, filters were placed on solid agarose medium with 0.02% X-Gal in 0.5 M sodium phosphate buffer, pH 7.0, 0.1% SDS, 6% dimethyl formamide (DMF), 7 mM β-mercaptoethanol, 1% agarose, and incubated at 37° C., to monitor β-galactosidase activity. Results were analyzed by scanning and quantification was performed using appropriate software.

d) Sequencing of Mutants

To recover the mutant expressing plasmids, yeast DNA was extracted using standard protocols and used to transform E. coli. Sequencing of the mutant ORFs was then performed on the plasmids by MILLEGEN SA. Alternatively, ORFs were amplified from yeast DNA by PCR (Akada et al., Biotechniques, 2000, 28, 668-670), and sequencing was performed directly on the PCR product by MILLEGEN SA.

2) Results

I-CreI mutants cleaving the 5GAT_P target, previously identified in a library where positions 24, 44, 68, 70, 75 and 77 of I-CreI are mutated, were screened for cleavage against the rosa1.3 DNA target (caacatgatgt_P; SEQ ID NO: 35). A total of 63 positive clones were found, rearranged in a 96-well plate and validated by secondary screening (FIG. 8). Among those positive clones, 22 (circled in FIG. 8) were chosen. Those 22 positive clones were sequenced. They turned out to correspond to 18 different novel endonucleases cleaving the rosa1.3 target (named m1 to m18: SEQ ID NO: 38 to 55; Table II).

TABLE II
I-CreI mutants capable of cleaving the rosa1.3 DNA target

Amino acids are given at positions 24, 44, 68, 70, 75 and 77 (ex: VYRSYI stands for V24, Y44, R68, S70, Y75 and I77).

Position on FIG. 8   Name   SEQ ID NO:   Amino acids
A1 and F3            m1     38           VYRSYI
A3                   m2     39           VYRSNI
A5 and B1            m3     40           VYDSRR
A9                   m4     41           ITYSYR
A11                  m5     42           VYRSYQ
B3, D5 and E6        m6     43           VYYSYR
B8                   m7     44           VYYSRA
B9                   m8     45           VYRSNV
B10                  m9     46           VNYSYR
B11                  m10    47           VNYSYR + 82T*
C3                   m11    48           VYSSRV
C8                   m12    49           VYNSRI
C11                  m13    50           VYSSRI
D6                   m14    51           VYRSQI
D9                   m15    52           IYRSNI
D12                  m16    53           VYYSRV
E1                   m17    54           VYRSYT
E11                  m18    55           VNSSRV

*82T in m10 is an unexpected mutation that may be due to an error introduced by the PCR reaction before sequencing of yeast DNA.
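The six-letter shorthand used in Table II can be expanded mechanically. The following is a minimal sketch, not part of the described method: the function names are hypothetical, and the wild-type residues (I24, Q44, R68, R70, D75, I77) are taken from the substitution names used elsewhere in this document (for example I24V, Q44K, R68E, R70S, D75N and I77R).

```python
# Positions covered by the six-letter shorthand (e.g. VYRSYI).
POSITIONS = (24, 44, 68, 70, 75, 77)

# Wild-type I-CreI residues at those positions, as implied by substitution
# names such as I24V, Q44K, R68E, R70S, D75N and I77R in this document.
WILD_TYPE = {24: "I", 44: "Q", 68: "R", 70: "R", 75: "D", 77: "I"}


def expand_code(code):
    """Map a six-letter code such as 'VYRSYI' to {position: residue}."""
    if len(code) != len(POSITIONS):
        raise ValueError(f"expected {len(POSITIONS)} letters, got {len(code)}")
    return dict(zip(POSITIONS, code))


def substitutions(code):
    """List substitutions relative to wild-type, e.g. 'I24V'."""
    return [
        f"{WILD_TYPE[pos]}{pos}{aa}"
        for pos, aa in expand_code(code).items()
        if aa != WILD_TYPE[pos]
    ]


if __name__ == "__main__":
    print(expand_code("VYRSYI"))
    print(substitutions("VYRSYI"))
```

Positions whose residue matches the wild-type (here R68 and I77 in m1) are simply omitted from the substitution list.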

EXAMPLE 3 Making of Meganucleases Cleaving Rosa1.4

This example shows that I-CreI mutants can cut the rosa1.4 DNA target sequence derived from the right part of the rosa1 target in a palindromic form (FIG. 5). All target sequences described in this example are 22 bp palindromic sequences. Therefore, they will be described only by the first 11 nucleotides, followed by the suffix _P. For example, rosa1.4 will be called tgggattatgt_P (SEQ ID NO: 36).
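The _P notation can be illustrated with a short sketch that rebuilds the full 22 bp palindrome from its first 11 nucleotides by appending the reverse complement. The helper names are hypothetical; the sequences are those quoted in this document.

```python
# Lowercase DNA complement table.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def expand_palindrome(half_site):
    """Build the 22 bp palindromic target from its 11-nucleotide half."""
    if len(half_site) != 11:
        raise ValueError("a _P target is described by exactly 11 nucleotides")
    return half_site + reverse_complement(half_site)


rosa1_3 = expand_palindrome("caacatgatgt")  # rosa1.3, caacatgatgt_P
rosa1_4 = expand_palindrome("tgggattatgt")  # rosa1.4, tgggattatgt_P

# Cloning oligonucleotide quoted for the rosa1.3 target (SEQ ID NO: 37).
OLIGO_SEQ_ID_37 = "tggcatacaagtttcaacatgatgtacatcatgttgacaatcgtctgtca"

if __name__ == "__main__":
    print(rosa1_3)
    print(rosa1_4)
    print(rosa1_3 in OLIGO_SEQ_ID_37)
```

A palindromic target is its own reverse complement, which is why a homodimeric meganuclease can recognize both halves.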

The rosa1.4 target is similar to 5TAT_P at positions ±1, ±2, ±3, ±4, ±5 and ±7 and to 10GGG_P at positions ±1, ±2, ±7, ±8, ±9 and ±10. It was hypothesized that positions ±6 and ±11 would have little effect on the binding and cleavage activity. Mutants able to cleave 5TAT_P were previously obtained by mutagenesis on I-CreI N75 at positions 44, 68 and 70, as described in Arnould et al., J. Mol. Biol., 2006, 355, 443-458 and International PCT Applications WO 2006/097784 and WO 2006/097853. Mutants able to cleave the 10GGG_P target were obtained by mutagenesis on I-CreI N75 at positions 28, 30, 33, 38, 40 and 70, as described in Smith et al., Nucleic Acids Res., Epub 27 Nov. 2006 and International PCT Application WO 2007/049156.

Both sets of proteins are mutated at position 70. However, it was hypothesized that two separable functional subdomains exist. That implies that this position has little impact on the specificity towards the bases ±8 to ±10 of the target.

Therefore, to check whether combined mutants could cleave the rosa1.4 target, mutations at positions 44, 68 and 70 from proteins cleaving 5TAT_P (caaaactatgt_P; SEQ ID NO: 33) were combined with the 28, 30, 33, 38 and 40 mutations from proteins cleaving 10GGG_P (cgggacgtcgt_P; SEQ ID NO: 31).

1) Material and Methods

The experimental procedures are as described in example 2 and as follows:

Construction of Combinatorial Mutants

I-CreI mutants cleaving 10GGG_P or 5TAT_P were identified in Smith et al., Nucleic Acids Res., Epub 27 Nov. 2006 and International PCT Application WO 2007/049156, and in Arnould et al., J. Mol. Biol., 2006, 355, 443-458 and International PCT Applications WO 2006/097784 and WO 2006/097853, respectively. In order to generate I-CreI derived coding sequences containing mutations from both series, separate overlapping PCR reactions were carried out that amplify the 5′ end (aa positions 1-43) or the 3′ end (positions 39-167) of the I-CreI coding sequence. For both the 5′ and 3′ ends, PCR amplification is carried out using primers Gal10F 5′-gcaactttagtgctgacacatacagg-3′ (SEQ ID NO: 56) or Gal10R 5′-acaaccttgattggagacttgacc-3′ (SEQ ID NO: 57), specific to the vector (pCLS0542, FIG. 7), and primers assF 5′-ctannnttgaccttt-3′ (SEQ ID NO: 58) or assR 5′-aaaggtcaannntag-3′ (SEQ ID NO: 59), where nnn codes for residue 40, specific to the I-CreI coding sequence for amino acids 39-43. The PCR fragments resulting from the amplification reactions performed with the same primers and with the same coding sequence for residue 40 were pooled. Then, each pool of PCR fragments resulting from the reaction with primers Gal10F and assR or assF and Gal10R was mixed in an equimolar ratio. Finally, approximately 25 ng of each final pool of the two overlapping PCR fragments and 75 ng of vector DNA (pCLS0542) linearized by digestion with NcoI and EagI were used to transform the yeast Saccharomyces cerevisiae strain FYC2-6A (MATα, trp1Δ63, leu2Δ1, his3Δ200) using a high efficiency LiAc transformation protocol (Gietz and Woods, Methods Enzymol., 2002, 350, 87-96). An intact coding sequence containing both groups of mutations is generated by in vivo homologous recombination in yeast.
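As an illustration of why the two PCR fragments can recombine, the sketch below checks that the degenerate primers assF and assR quoted above are reverse complements of each other, and that their length matches the five-codon overlap (amino acids 39-43) shared by the two fragments. The helper names are hypothetical, and n is treated as its own complement.

```python
# Complement table including the degenerate base n (any nucleotide).
COMPLEMENT = str.maketrans("acgtn", "tgcan")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA sequence that may contain n."""
    return seq.translate(COMPLEMENT)[::-1]


ASS_F = "ctannnttgaccttt"  # SEQ ID NO: 58
ASS_R = "aaaggtcaannntag"  # SEQ ID NO: 59

# The 5' fragment covers amino acids 1-43 and the 3' fragment 39-167,
# so the shared region spans amino acids 39 to 43 inclusive.
overlap_codons = 43 - 39 + 1
overlap_nt = 3 * overlap_codons

if __name__ == "__main__":
    print(reverse_complement(ASS_F) == ASS_R)
    print(overlap_nt, len(ASS_F))
```

The second codon of assF (nnn) is the degenerate position coding for residue 40, which is why fragments are pooled by the identity of that codon before assembly.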

2) Results

I-CreI combinatorial mutants were constructed by associating mutations at positions 44, 68 and 70 with the 28, 30, 33, 38 and 40 mutations on the I-CreI N75 scaffold, resulting in a library of complexity 2208. Examples of combinatorial mutants are displayed in Table III. This library was transformed into yeast and 3456 clones (1.5 times the diversity) were screened for cleavage against the rosa1.4 DNA target (tgggattatgt_P; SEQ ID NO: 36). A total of 69 positive clones were found and were rearranged in a 96-well plate and validated by secondary screening (FIG. 9). Among those positives, 15 clones (circled in FIG. 9) were chosen. After sequencing, these 15 clones turned out to correspond to 8 different novel endonucleases cleaving the rosa1.4 DNA target (SEQ ID NO: 60 to 67; Table III).
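The screening depth quoted above can be put in perspective with a back-of-the-envelope calculation. The sketch below is an assumption-laden estimate, not part of the described protocol: it supposes that every one of the 2208 combinations is equally likely to be picked, which a real yeast library only approximates, and the function name is hypothetical.

```python
def expected_coverage(library_size, clones_screened):
    """Expected fraction of distinct variants sampled at least once,
    assuming each clone is drawn uniformly and independently."""
    return 1.0 - (1.0 - 1.0 / library_size) ** clones_screened


LIBRARY_SIZE = 2208      # complexity of the combinatorial library
CLONES_SCREENED = 3456   # clones screened against rosa1.4

oversampling = CLONES_SCREENED / LIBRARY_SIZE
coverage = expected_coverage(LIBRARY_SIZE, CLONES_SCREENED)

if __name__ == "__main__":
    print(f"oversampling: {oversampling:.2f}x")
    print(f"expected coverage under uniform sampling: {coverage:.1%}")
```

Under this idealized model, roughly one variant in five would not be sampled at this depth, so the eight endonucleases found are best read as a lower bound on the number of active combinations.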

TABLE III
Cleavage of the rosa1.4 target by the combinatorial variants

Columns: amino acids at positions 28, 30, 33, 38 and 40 (ex: ENRRR stands for E28, N30, R33, R38 and R40). Rows: amino acids at positions 44, 68 and 70 (ex: AHQ stands for A44, H68 and Q70).

       ENRRR  ENRRK  KNHAS  KNHSS  KNHQS  KNRAT  RNRDR
AHQ    +      +
ARN    +
ARS           +
VRA           +
ARG           +
ASQ    +
ATN           +
RAG
ANN
AQH
ARH
ARL
ART
NRN
AQA

Only 105 out of the 2208 combinations are displayed. + indicates that a functional combinatorial mutant was found among the sequenced positives.

EXAMPLE 4 Making of Meganucleases Cleaving Rosa1

I-CreI mutants able to cleave each of the palindromic rosa1 derived targets (rosa1.3 and rosa1.4) were identified in examples 2 and 3. Pairs of such mutants (one cutting rosa1.3 and one cutting rosa1.4) were co-expressed in yeast. Upon coexpression, there should be three active molecular species, two homodimers, and one heterodimer. It was assayed whether the heterodimers that should be formed cut the non-palindromic rosa1 and rosa1.2 DNA targets.
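The count of three molecular species follows from simple combinatorics, sketched below. The names are hypothetical labels for the mutants of Tables II and III; the code only enumerates pairings and makes no claim about their relative abundance or activity.

```python
from itertools import combinations_with_replacement, product


def dimer_species(monomers):
    """Enumerate unordered dimers that can form from co-expressed monomers."""
    return list(combinations_with_replacement(monomers, 2))


# One rosa1.3 cutter co-expressed with one rosa1.4 cutter.
species = dimer_species(["m1", "ENRRR/AHQ"])
heterodimers = [pair for pair in species if pair[0] != pair[1]]

# All possible co-expression experiments: 18 rosa1.3 cutters x 8 rosa1.4 cutters.
ROSA1_3_CUTTERS = [f"m{i}" for i in range(1, 19)]
ROSA1_4_CUTTERS = [
    "ENRRR/AHQ", "ENRRR/ARN", "ENRRR/ASQ", "ENRRK/AHQ",
    "ENRRK/ARS", "ENRRK/VRA", "ENRRK/ARG", "ENRRK/ATN",
]
pairings = list(product(ROSA1_3_CUTTERS, ROSA1_4_CUTTERS))

if __name__ == "__main__":
    print(species)        # two homodimers and one heterodimer
    print(heterodimers)
    print(len(pairings))  # number of pairwise co-expressions
```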

1) Material and Methods

a) Cloning of Mutants in Kanamycin Resistant Vector

To co-express two I-CreI mutants in yeast, mutants cutting the rosa1.4 sequence were subcloned in a yeast expression vector marked with a kanamycin resistance gene (pCLS1107, FIG. 10). Mutants were amplified by PCR reaction using primers common for pCLS0542 and pCLS1107: Gal10F 5′-gcaactttagtgctgacacatacagg-3′ (SEQ ID NO: 56) and Gal10R 5′-acaaccttgattggagacttgacc-3′ (SEQ ID NO: 57). Approximately 25 ng of PCR fragment and 25 ng of vector DNA (pCLS1107) linearized by digestion with DraIII and NgoMIV are used to transform the yeast Saccharomyces cerevisiae strain FYC2-6A (MATα, trp1Δ63, leu2Δ1, his3Δ200) using a high efficiency LiAc transformation protocol. An intact coding sequence for the I-CreI mutant is generated by in vivo homologous recombination in yeast. Each yeast strain containing a mutant cutting the rosa1.4 target subcloned in pCLS1107 vector was then mated with yeast expressing the rosa1.4 target to validate it. To recover the mutant expressing plasmids, yeast DNA was extracted using standard protocols and used to transform E. coli and prepare E. coli DNA.

b) Mutants Coexpression

A yeast strain expressing a mutant cutting the rosa1.3 target in the pCLS0542 expression vector was transformed with DNA coding for a mutant cutting the rosa1.4 target in the pCLS1107 expression vector. Transformants were selected on −L Glu + G418 medium.

c) Mating of Meganucleases Coexpressing Clones and Screening in Yeast

Mating was performed using a colony gridder (QpixII, Genetix). Mutants were gridded on nylon filters covering YPD plates, using a low gridding density (about 4 spots/cm²). A second gridding process was performed on the same filters to spot a second layer consisting of different reporter-harbouring yeast strains for each target. Membranes were placed on solid agar YPD rich medium, and incubated at 30° C. for one night, to allow mating. Next, filters were transferred to synthetic medium, lacking leucine and tryptophan, adding G418, with galactose (1%) as a carbon source, and incubated for five days at 37° C., to select for diploids carrying the expression and target vectors. After 5 days, filters were placed on solid agarose medium with 0.02% X-Gal in 0.5 M sodium phosphate buffer, pH 7.0, 0.1% SDS, 6% dimethyl formamide (DMF), 7 mM β-mercaptoethanol, 1% agarose, and incubated at 37° C., to monitor β-galactosidase activity. Results were analyzed by scanning and quantification was performed using appropriate software.

2) Results

Coexpression of mutants cleaving the rosa1.3 target (m1 to m18 described in Table II) and the eight mutants cleaving the rosa1.4 target (described in Table III) resulted in efficient cleavage of the rosa1.2 target in all cases (screen examples are shown in FIG. 11A). All combinations tested are summarized in Table IV. Most of these combinations are also able to cut the rosa1 natural target, which differs from the rosa1.2 sequence by just 1 bp at position +1 (FIG. 5). As shown in FIG. 11B, the signal observed on the rosa1 natural target is weak compared to the one observed on the rosa1.2 target. The combinations cleaving the rosa1 DNA target are presented in Table V.

TABLE IV
Combinations that resulted in cleavage of the rosa1.2 target

Columns: mutants cutting rosa1.4, amino acids at positions 28, 30, 33, 38, 40/44, 68 and 70 (ex: ENRRR/AHQ stands for E28, N30, R33, R38, R40/A44, H68 and Q70). Rows: mutants cutting rosa1.3, amino acids at positions 24, 44, 68, 70, 75 and 77 (ex: VYRSYI stands for V24, Y44, R68, S70, Y75 and I77).

                   ENRRR/AHQ  ENRRR/ARN  ENRRR/ASQ  ENRRK/AHQ  ENRRK/ARS  ENRRK/VRA  ENRRK/ARG  ENRRK/ATN
m1  VYRSYI         +          +          +          +          +          +          +          +
m2  VYRSNI         +          +          +          +          +          +          +          +
m3  VYDSRR         +          +          +          +          +          +          +          +
m4  ITYSYR         +          +          +          +          +          +          +          +
m5  VYRSYQ         +          +          +          +          +          +          +          +
m6  VYYSYR         +          +          +          +          +          +          +          +
m7  VYYSRA         +          +          +          +          +          +          +          +
m8  VYRSNV         +          +          +          +          +          +          +          +
m9  VNYSYR         +          +          +          +          +          +          +          +
m10 VNYSYR + 82T   +          +          +          +          +          +          +          +
m11 VYSSRV         +          +          +          +          +          +          +          +
m12 VYNSRI         +          +          +          +          +          +          +          +
m13 VYSSRI         +          +          +          +          +          +          +          +
m14 VYRSQI         +          +          +          +          +          +          +          +
m15 IYRSNI         +          +          +          +          +          +          +          +
m16 VYYSRV         +          +          +          +          +          +          +          +
m17 VYRSYT         +          +          +          +          +          +          +          +
m18 VNSSRV         +          +          +          +          +          +          +          +

+ indicates that the heterodimeric mutant cleaves the rosa1.2 target.

TABLE V
Combinations that resulted in cleavage of the rosa1 target

Mutants cutting rosa1.4, amino acids at positions 28, 30, 33, 38, 40/44, 68 and 70 (ex: ENRRR/AHQ stands for E28, N30, R33, R38, R40/A44, H68 and Q70): ENRRR/AHQ, ENRRR/ARN, ENRRR/ASQ, ENRRK/AHQ, ENRRK/ARS, ENRRK/VRA, ENRRK/ARG, ENRRK/ATN

Mutants cutting rosa1.3, amino acids at positions 24, 44, 68, 70, 75 and 77 (ex: VYRSYI stands for V24, Y44, R68, S70, Y75 and I77):

m1  VYRSYI
m2  VYRSNI         + + + + +
m3  VYDSRR
m4  ITYSYR
m5  VYRSYQ
m6  VYYSYR         + + + + +
m7  VYYSRA
m8  VYRSNV         + + + + +
m9  VNYSYR
m10 VNYSYR + 82T
m11 VYSSRV
m12 VYNSRI         + + + + + + +
m13 VYSSRI         + + + + + + +
m14 VYRSQI         + + + + + + +
m15 IYRSNI
m16 VYYSRV         + + + + + + +
m17 VYRSYT         + + + + + + + +
m18 VNSSRV

+ indicates that the heterodimeric mutant cleaves the rosa1 target.

EXAMPLE 5 Refinement of Meganucleases Cleaving Rosa1 by Random Mutagenesis of Proteins Cleaving Rosa1.4 and Assembly with Proteins Cleaving Rosa1.3

I-CreI mutants able to cleave the non-palindromic rosa1.2 and rosa1 targets were identified by assembly of mutants cleaving the palindromic rosa1.3 and rosa1.4 targets. However, the combinations cleaved rosa1.2 efficiently but cleaved rosa1 only weakly, although rosa1 differs from rosa1.2 by only 1 bp at position +1. The signal observed on rosa1 is not sufficient.

Therefore, protein combinations cleaving rosa1 were mutagenized, and mutants cleaving rosa1 efficiently were screened. According to the structure of the I-CreI protein bound to its target, there is no contact between the 4 central base pairs (positions −2 to +2) and the I-CreI protein (Chevalier et al., Nat. Struct. Biol., 2001, 8, 312-316; Chevalier B. S. and Stoddard B. L., Nucleic Acids Res., 2001, 29, 3757-3774; Chevalier et al., J. Mol. Biol., 2003, 329, 253-269). Thus, it is difficult to rationally choose a set of positions to mutagenize, and mutagenesis was done on the C-terminal part of the protein (83 last amino acids) or on the whole protein. Random mutagenesis results in high complexity libraries, and the complexity of the variant libraries to be tested was limited by mutagenizing only one of the two components of the heterodimers cleaving rosa1.

Thus, proteins cleaving rosa1.4 were mutagenized, and it was tested whether they could efficiently cleave rosa1 when co-expressed with proteins cleaving rosa1.3.

1) Material and Methods

a) Random Mutagenesis:

Random mutagenesis libraries were created on pools of chosen mutants, by PCR using Mn²⁺ or derivatives of dNTPs such as 8-oxo-dGTP and dPTP in a two-step PCR process as described in the protocol from JENA BIOSCIENCE GmbH in the JBS dNTP-Mutagenesis kit. For random mutagenesis on the whole protein, primers used are: preATGCreFor (5′-gcataaattactatacttctatagacacgcaaacacaaatacacagcggccttgccacc-3′; SEQ ID NO: 68) and ICreIpostRev (5′-ggctcgaggagctcgtctagaggatcgctcgagttatcagtcggccgc-3′; SEQ ID NO: 69). For random mutagenesis on the C-terminal part of the protein, primers used are: AA78a83For (5′-ttaagcgaaatcaagccg-3′; SEQ ID NO: 70) and ICreIpostRev with dNTPs derivatives; the rest of the protein is amplified with a high fidelity taq polymerase and without dNTPs derivatives using primers preATGCreFor and AA78a83Rev (5′-cggcttgatttcgcttaa-3′; SEQ ID NO: 71).

Pools of mutants were amplified by PCR reaction using these primers common for pCLS0542 (FIG. 7) and pCLS1107 (FIG. 10). Approximately 75 ng of PCR fragment and 75 ng of vector DNA (pCLS1107) linearized by digestion with DraIII and NgoMIV are used to transform the yeast Saccharomyces cerevisiae strain FYC2-6A (MATα, trp1Δ63, leu2Δ1, his3Δ200) using a high efficiency LiAc transformation protocol. A library of intact coding sequences for the I-CreI mutant is generated by in vivo homologous recombination in yeast. Positive resulting clones were verified by sequencing as described in example 2.

b) Cloning of Mutants in Leucine Expression Vector in the Yeast Strain Containing the Rosa1 Target:

The yeast strain FYBL2-7B (MAT a, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202) containing the rosa1 target in the yeast reporter vector (pCLS1055, FIG. 6) is transformed with mutants cutting the rosa1.3 target, in the pCLS0542 vector marked with the LEU2 gene, using a high efficiency LiAc transformation protocol. The resulting yeast strains are used as targets for mating assays as described in example 4.

2) Results

Four mutants cleaving rosa1.4 (ENRRR/AHQ, ENRRR/ARN, ENRRK/AHQ and ENRRK/VRA according to Table V) were pooled, randomly mutagenized on the whole protein or on the C-terminal part of the protein, and transformed into yeast. 4464 transformed clones were then mated with a yeast strain that (i) contains the rosa1 target in a reporter plasmid and (ii) expresses a variant cleaving the rosa1.3 target, chosen among those described in example 2. Three such strains were used, expressing the I-CreI V24 Y44 S68 S70 R75 I77 (or VYSSRI) mutant, the I-CreI V24 Y44 R68 S70 Q75 I77 (or VYRSQI) mutant, or the I-CreI V24 Y44 R68 S70 Y75 T77 (or VYRSYT) mutant (see Table II). Two clones were found to trigger a better cleavage of the rosa1 target when mated with such a yeast strain compared to the mutants before mutagenesis with the same yeast strain (FIG. 12). In conclusion, two proteins able to efficiently cleave rosa1 when forming heterodimers with VYSSRI, VYRSQI or VYRSYT were identified (Table VI).

TABLE VI
Functional mutant combinations displaying strong cleavage activity for rosa1 DNA target

Rows: mutant cutting rosa1.3, amino acids at positions 24, 44, 68, 70, 75 and 77 (ex: VYRSYI stands for V24, Y44, R68, S70, Y75 and I77). Entries: optimized mutant rosa1.4* (SEQ ID NO: 72, 73).

VYSSRI (m13)   MO_1: E28 R33 R38 R40 A44 H68 Q70 N75 A105 R107
               MO_2: E28 R33 R38 K40 A44 H68 Q70 N75 A105 R107 A151 G153 E158
VYRSQI (m14)   MO_1: E28 R33 R38 R40 A44 H68 Q70 N75 A105 R107
               MO_2: E28 R33 R38 K40 A44 H68 Q70 N75 A105 R107 A151 G153 E158
VYRSYT (m17)   MO_1: E28 R33 R38 R40 A44 H68 Q70 N75 A105 R107
               MO_2: E28 R33 R38 K40 A44 H68 Q70 N75 A105 R107 A151 G153 E158

*mutations resulting from random mutagenesis are in bold.

EXAMPLE 6 Refinement of Meganucleases Cleaving Rosa1 by Random Mutagenesis of Proteins Cleaving Rosa1.3 and Assembly with Refined Proteins Cleaving Rosa1.4

I-CreI mutants able to cleave the rosa1 target were identified by assembly of mutants cleaving rosa1.3 and refined mutants cleaving rosa1.4. To increase the activity of the meganucleases, the second component of the heterodimers cleaving rosa1 was mutagenized. In this example, mutants cleaving rosa1.3 were mutagenized, followed by screening of more efficient variants cleaving rosa1 in combination with the refined mutants cleaving rosa1.4 identified in example 5.

1) Material and Methods

a) Random Mutagenesis:

Random mutagenesis libraries were created on pools of chosen mutants, by PCR using Mn²⁺ or derivatives of dNTPs such as 8-oxo-dGTP and dPTP in a two-step PCR process as described in the protocol from JENA BIOSCIENCE GmbH in the JBS dNTP-Mutagenesis kit. For random mutagenesis on the whole protein, primers used are: preATGCreFor (5′-gcataaattactatacttctatagacacgcaaacacaaatacacagcggccttgccacc-3′; SEQ ID NO: 68) and ICreIpostRev (5′-ggctcgaggagctcgtctagaggatcgctcgagttatcagtcggccgc-3′; SEQ ID NO: 69). For random mutagenesis on the C-terminal part of the protein, primers used are: AA78a83For (5′-ttaagcgaaatcaagccg-3′; SEQ ID NO: 70) and ICreIpostRev with dNTPs derivatives; the rest of the protein is amplified with a high fidelity taq polymerase and without dNTPs derivatives using primers preATGCreFor and AA78a83Rev (5′-cggcttgatttcgcttaa-3′; SEQ ID NO: 71).

Pools of mutants were amplified by PCR reaction using these primers common for pCLS0542 (FIG. 7) and pCLS1107 (FIG. 10). Approximately 75 ng of PCR fragment and 75 ng of vector DNA (pCLS0542) linearized by digestion with NcoI and EagI are used to transform the yeast Saccharomyces cerevisiae strain FYC2-6A (MATα, trp1Δ63, leu2Δ1, his3Δ200) using a high efficiency LiAc transformation protocol. A library of intact coding sequences for the I-CreI mutant is generated by in vivo homologous recombination in yeast. Positive resulting clones were verified by sequencing as described in example 2.

b) Cloning of Mutants in Kanamycin Expression Vector in the Yeast Strain Containing the Rosa1 Target

The yeast strain FYBL2-7B (MAT a, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202) containing the rosa1 target in the yeast reporter vector (pCLS1055, FIG. 6) is transformed with the MO_1 and MO_2 refined mutants, cutting the rosa1.4 target, in the pCLS1107 vector, using a high efficiency LiAc transformation protocol. Mutant-target yeasts are used as targets for mating assays as described in example 4.

2) Results

Two pools of four mutants cleaving rosa1.3 (pool 1: VYRSNI, VYYSYR, VYRSNV and VYNSRI and pool 2: VYYSYR, VYSSRI, VYRSQI and VYRSYT according to Table V) were randomly mutagenized on the whole protein or on the C-terminal part of the protein and transformed into yeast. 8928 transformed clones were then mated with a yeast strain that (i) contains the rosa1 target in a reporter plasmid and (ii) expresses a variant cleaving the rosa1.4 target. Two such strains were used, expressing either the I-CreI E28 R33 R38 R40 A44 H68 Q70 N75 A105 R107 (or MO_1) mutant or the I-CreI E28 R33 R38 K40 A44 H68 Q70 N75 A105 R107 A151 G153 E158 (or MO_2) mutant. Five clones were found to trigger a better cleavage of the rosa1 target when mated with such a yeast strain compared to the mutants before mutagenesis with the same yeast strain (FIG. 13). After sequencing, they turned out to correspond to four proteins. In conclusion, four proteins able to efficiently cleave rosa1 when forming heterodimers with MO_1 or MO_2 were identified (Table VII).

TABLE VII
Functional mutant combinations displaying strong cleavage activity for rosa1 DNA target

Rows: optimized mutant rosa1.4. Entries: optimized mutant rosa1.3* (SEQ ID NO: 74 to 77).

MO_1 (E28 R33 R38 R40 A44 H68 Q70 N75 A105 R107)
               mO_1: S19 V24 Y44 R68 S70 N75 V77
               mO_2: S19 V24 Y44 R68 S70 Q75 I77
               mO_3: V24 Y44 S68 S70 R75 I77 A105
               mO_4: V24 Y44 S68 S70 R75 I77 G79
MO_2 (E28 R33 R38 K40 A44 H68 Q70 N75 A105 R107 A151 G153 E158)
               mO_1: S19 V24 Y44 R68 S70 N75 V77
               mO_2: S19 V24 Y44 R68 S70 Q75 I77
               mO_3: V24 Y44 S68 S70 R75 I77 A105
               mO_4: V24 Y44 S68 S70 R75 I77 G79

*mutations resulting from random mutagenesis are in bold.

EXAMPLE 7 Validation of Rosa1 Target Cleavage in an Extrachromosomal Model in CHO Cells

In example 6, I-CreI refined mutants able to efficiently cleave the rosa1 target in yeast were identified. In this example, the ability of two combinations of mutants to cut the rosa1 target in CHO cells was tested using an extrachromosomal assay in mammalian cells.

1) Materials and Methods

a) Cloning of Rosa1 Target in a Vector for CHO Screen

The target was cloned as follows: an oligonucleotide corresponding to the target sequence flanked by Gateway cloning sequences was ordered from Proligo: 5′ tggcatacaagtttcaacatgatgtacatcatgttgacaatcgtctgtca 3′ (SEQ ID NO: 37). Double-stranded target DNA, generated by PCR amplification of the single-stranded oligonucleotide, was cloned using the Gateway protocol (INVITROGEN) into the CHO reporter vector (pCLS1058, FIG. 14).

b) Re-Cloning of Meganucleases

The ORFs of I-CreI N75, I-SceI and the I-CreI mutants identified in example 6 were amplified by PCR and sequenced (MILLEGEN). Then, ORFs were recloned using the Gateway protocol (INVITROGEN). ORFs were amplified by PCR of yeast DNA using the primers B1F: 5′ ggggacaagtttgtacaaaaaagcaggatcgaaggagatagaaccatggccaataccaaatataacaaagagttcc 3′ (SEQ ID NO: 78) and B2R: 5′ ggggaccactttgtacaagaaagctgggtttagtcggccgccggggaggatttcttcttctcgc 3′ (SEQ ID NO: 79) from Proligo. PCR products were cloned in the CHO Gateway expression vector pcDNA6.2 from Invitrogen (pCLS1069, FIG. 15). Resulting clones were verified by sequencing as described in example 2.

c) Extrachromosomal Assay in Mammalian Cells

CHO cells were transfected with Polyfect transfection reagent according to the supplier's (QIAGEN) protocol. 72 hours after transfection, culture medium was removed and 150 μl of lysis/revelation buffer was added for the β-galactosidase liquid assay (typically, 1 liter of buffer contains 100 ml of lysis buffer (Tris-HCl 10 mM pH 7.5, NaCl 150 mM, Triton X100 0.1%, BSA 0.1 mg/ml, protease inhibitors), 10 ml of Mg 100× buffer (MgCl₂ 100 mM, β-mercaptoethanol 35%), 110 ml of ONPG 8 mg/ml and 780 ml of sodium phosphate 0.1 M pH 7.5). After incubation at 37° C., optical density was measured at 420 nm. The entire process is performed on an automated BioCel® platform (VELOCITY11).
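The buffer recipe above is given per liter, whereas only 150 μl is used per well. The following sketch scales the stated volumes to an arbitrary batch size; the function and dictionary names are hypothetical, and the 96-well batch is only an illustrative assumption that ignores dead volume and pipetting excess.

```python
# Component volumes (ml) for 1 liter of lysis/revelation buffer, as stated.
RECIPE_ML_PER_LITER = {
    "lysis buffer": 100.0,
    "Mg 100x buffer": 10.0,
    "ONPG 8 mg/ml": 110.0,
    "sodium phosphate 0.1 M pH 7.5": 780.0,
}


def scale_recipe(total_ml):
    """Return component volumes (ml) for a batch of total_ml of buffer."""
    return {name: vol * total_ml / 1000.0 for name, vol in RECIPE_ML_PER_LITER.items()}


# Final ONPG concentration in the mixed buffer (mg/ml).
final_onpg_mg_per_ml = 8.0 * RECIPE_ML_PER_LITER["ONPG 8 mg/ml"] / 1000.0

# Illustrative batch: 96 wells at 150 microliters each, no excess.
batch_ml = 96 * 0.150
batch = scale_recipe(batch_ml)

if __name__ == "__main__":
    print(sum(RECIPE_ML_PER_LITER.values()))  # components add up to 1 liter
    print(final_onpg_mg_per_ml)
    for name, vol in batch.items():
        print(f"{name}: {vol:.3f} ml")
```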

2) Results

The results of two experiments, presented in FIG. 16, show that two combinations of I-CreI mutants (mO_2/MO_1 and mO_2/MO_2) are able to cut the rosa1 target in CHO cells with an activity similar to that of I-CreI N75 against the I-CreI target (tcaaaacgtcgtgagacagtttgg, SEQ ID NO: 80) or I-SceI against the I-SceI target (tagggataacagggtaat, SEQ ID NO: 81).

EXAMPLE 8 Genome Engineering at the ROSA26 Locus in Mouse Cells

I-CreI refined mutants able to efficiently cleave the rosa1 target in yeast and in an extrachromosomal assay in mammalian cells (CHO K1 cells) have been identified in examples 6 and 7. The ability of one combination of two I-CreI refined mutants to induce homologous recombination at the ROSA26 locus in mouse L cells was tested in this example.

1) Materials and Methods

a) Knock-In (KI) Matrices

Two knock-in matrices comprising the hygromycin resistance gene coding sequence (CDS) cloned between two mouse ROSA26 homology arms, HG ROSA26 from 6283 to 8317 and HD ROSA26 from 8313 to 10319 in CQ880114 sequence (corresponding to SEQ ID NO: 3 in the sequence listing), were constructed. The resulting plasmids are pCLS1679 and pCLS1675 (plasmid map in FIG. 18). In pCLS1679, the coding sequence of the hygromycin resistance gene (hygro CDS) operatively linked to the SV40 polyA was cloned in pBR322 vector (PROMEGA) between HG ROSA26 and HD ROSA26. pCLS1675 differs from pCLS1679 by the insertion of an Internal Ribosomal Entry Site (IRES; SEQ ID NO: 139) just upstream of the hygro CDS.

b) Cloning of Meganucleases

The ORFs of the I-CreI refined mutants mO_2 and MO_1 are described in example 6 (Table VII). The mutants were expressed from two expression vectors, under the control of the human elongation factor 1 alpha (EF1α) promoter or the cytomegalovirus immediate early (CMV) promoter (pCLS1069, FIG. 15). Mutants were cloned in pCLS1069 under the CMV promoter as described in example 7. The resulting plasmids were verified by sequencing (MILLEGEN). In pCLS1761 (FIG. 19) and pCLS1762 (FIG. 20), the mO_2 and MO_1 I-CreI mutants, respectively, are under the control of the EF1α promoter.

c) Knock-In Experiment in Mouse L Cells

Mouse L cells (ATCC # CRL-2648) are cultivated in complete DMEM medium (DMEM Glutamax, GIBCO) supplemented with 10% fetal calf serum, penicillin, streptomycin and fungizone. Cells are transfected using Lipofectamine reagent (INVITROGEN) according to the procedure recommended by the manufacturer. Two days after transfection, selection is performed using hygromycin at 0.6 mg/ml in complete medium. After two weeks of selection, resistant clones are picked using a ClonePix robot (GENETIX). Clones are amplified for one week in 96-well plates in complete medium supplemented with hygromycin at 0.6 mg/ml. Genomic DNA is extracted from resistant clones cultured in 96-well plates using the ZR96 kit (ZYMO RESEARCH).

d) PCR Analysis of Knock-In Events

Knock-in events are detected by PCR analysis on genomic DNA using the pair of primers KI_GHG_S5 (5′ tagtatacagaaactgttgcatcgc 3′; SEQ ID NO: 137) and HygSeqRev (5′ cgtctgctgctccatacaag 3′; SEQ ID NO: 138), located respectively in the mouse ROSA26 sequence upstream of the HG ROSA26 homology arm and in the hygromycin CDS, to obtain a KI specific PCR amplification (FIG. 21).

2) Results

The ROSA26 meganucleases used in this example are mO_2 and MO_1, described in example 6 (Table VII) and cloned in two expression vectors, under the control of the human elongation factor 1 alpha (EF1α) promoter (pCLS1761 and pCLS1762) or the cytomegalovirus immediate early (CMV) promoter (pCLS1069, FIG. 15). Mouse L cells were cotransfected with three vectors: two plasmids expressing the mO_2 and MO_1 ROSA26 meganucleases and the KI matrix. The meganucleases were expressed from pCLS1761 and pCLS1762, respectively, under the control of the EF1α promoter, and the KI matrix was pCLS1675.

A total of 2,600,000 mouse L cells were cotransfected with 2 μg of KI matrix vector and 5 μg or 10 μg of each meganuclease expression vector. As a control of spontaneous KI frequency, the same number of cells was transfected with 2 μg of KI matrix vector alone. The transfection efficacy (40%) was determined by FACS analysis using a fluorescent marker expressing plasmid. The frequency of resistant clones was determined by counting the total number of hygromycin resistant clones and corrected by transfection efficacy. 2605, 1197 and 1902 hygromycin resistant clones were obtained, respectively (Table VIII). 92 or 184 clones were picked per condition and analysed by PCR as described in materials and methods. Results are presented in Table VIII.

TABLE VIII
PCR result and frequency of KI events at the ROSA26 locus in mouse L cells

Vectors transfected                               Total number of    Corrected Hygro^R   Number of PCR positives/   KI events
                                                  Hygro^R clones     frequency           picked clones              frequency
2 μg pCLS1675, 5 μg pCLS1761, 5 μg pCLS1762       2605               2.5 × 10⁻³          18/92                      4.9 × 10⁻⁴
2 μg pCLS1675, 10 μg pCLS1761, 10 μg pCLS1762     1197               1.1 × 10⁻³          28/92                      3.5 × 10⁻⁴
2 μg pCLS1675                                     1902               1.8 × 10⁻³          0/184                      0

Cotransfection of ROSA26 meganucleases and KI matrix induced homologous recombination at the mouse ROSA26 locus in L cells at a maximal frequency of 4.9×10⁻⁴. No spontaneous homologous recombination was observed with transfection of the KI matrix alone. This example illustrates the ability of ROSA26 meganucleases to induce homologous recombination at the mouse ROSA26 locus in mouse L cells.
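The frequencies reported for this experiment can be re-derived from the raw counts. The sketch below is a reading of the text rather than a stated formula: it assumes that the corrected frequency is the number of resistant clones divided by the number of transfected cells (cells × transfection efficacy), and that the KI frequency is that value multiplied by the fraction of PCR-positive clones. Function names are hypothetical. Under these assumptions the results agree with the tabulated values to within rounding.

```python
CELLS = 2_600_000   # mouse L cells per condition
EFFICACY = 0.40     # transfection efficacy determined by FACS


def corrected_frequency(resistant_clones, cells=CELLS, efficacy=EFFICACY):
    """Hygromycin-resistant clones per transfected cell (assumed formula)."""
    return resistant_clones / (cells * efficacy)


def ki_frequency(resistant_clones, pcr_positive, picked):
    """Corrected frequency scaled by the PCR-positive fraction (assumed formula)."""
    return corrected_frequency(resistant_clones) * pcr_positive / picked


# (resistant clones, PCR positives, picked clones) for the three conditions.
CONDITIONS = {
    "5 ug of each meganuclease vector": (2605, 18, 92),
    "10 ug of each meganuclease vector": (1197, 28, 92),
    "KI matrix alone": (1902, 0, 184),
}

if __name__ == "__main__":
    for label, (clones, positives, picked) in CONDITIONS.items():
        print(
            f"{label}: corrected {corrected_frequency(clones):.2e}, "
            f"KI {ki_frequency(clones, positives, picked):.2e}"
        )
```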

EXAMPLE 9 Meganucleases Derived from mO_2 and MO_1

Meganuclease constructs were engineered from mO_2 (SEQ ID NO: 75) and MO_1 (SEQ ID NO: 72) by using conventional techniques of molecular biology and recombinant DNA, which are explained fully in Current Protocols in Molecular Biology (Frederick M. AUSUBEL, 2000, Wiley and Son Inc, Library of Congress, USA); Molecular Cloning: A Laboratory Manual, Third Edition, (Sambrook et al, 2001, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press).

An NLS (KKKRK; SEQ ID NO: 134) was inserted between the first (M₁) and the second (A₂) amino acids of MO_1 and mO_2; the resulting variants are SEQ ID NO: 140 and 141, respectively.

A tag (TagHA; YPYDVPDYA; SEQ ID NO: 135) was inserted between the first (M₁) and the third amino acid (N₃) of mO_2; the resulting variant is SEQ ID NO: 142.

A tag (STag; KETAAAKFERQHMDS; SEQ ID NO: 136) was inserted between the first (M₁) and the second (A₂) amino acids of MO_1; the resulting variant is SEQ ID NO: 143.

A tag (TagHA; YPYDVPDYA; SEQ ID NO: 135) and an NLS (KKKRK; SEQ ID NO: 134) were inserted between the first (M₁) and the second amino acid (A₂) of mO_2; the resulting variant is SEQ ID NO: 144.

A tag (STag; KETAAAKFERQHMDS; SEQ ID NO: 136) and an NLS (KKKRK; SEQ ID NO: 134) were inserted between the first (M₁) and the second (A₂) amino acids of MO_1; the resulting variant is SEQ ID NO: 145.

A single-chain meganuclease comprising an MO_1 monomer (positions 1 to 166 of SEQ ID NO: 72) separated from an mO_2 monomer (positions 3 to 164 of SEQ ID NO: 75) by a linker (GGSDKYNQALSKYNQALSKYNQALSGGGGS; SEQ ID NO: 149) was constructed: the resulting single-chain meganuclease is SEQ ID NO: 146.

An obligate heterodimer derived from mO_2/MO_1 was engineered by introducing the E8K and E61R mutations in an mO_2 monomer and the K7E and K96E mutations in an MO_1 monomer; the resulting heterodimer consists of SEQ ID NO: 147 and SEQ ID NO: 148.
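Substitution names such as E8K or K7E encode the original residue, the position and the new residue. The sketch below shows one way to apply such names to a protein sequence while checking the original residue; the function name is hypothetical, and the ten-residue sequence is a toy fragment chosen only so that positions 7 and 8 carry K and E, not a sequence from the sequence listing.

```python
import re

# Substitution name: original residue, 1-based position, new residue.
SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_substitutions(sequence, names):
    """Apply substitutions such as 'E8K' to a protein sequence.

    Raises ValueError if a name is malformed, out of range, or if the
    residue found at the position differs from the one named.
    """
    residues = list(sequence)
    for name in names:
        match = SUBSTITUTION.fullmatch(name)
        if match is None:
            raise ValueError(f"malformed substitution: {name}")
        original, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range: {name}")
        if residues[position - 1] != original:
            raise ValueError(f"expected {original} at {position}, found {residues[position - 1]}")
        residues[position - 1] = new
    return "".join(residues)


# Toy fragment for illustration only (hypothetical, not from the listing).
TOY_FRAGMENT = "MNTKYNKEFL"

if __name__ == "__main__":
    print(apply_substitutions(TOY_FRAGMENT, ["E8K"]))  # charge swap at position 8
    print(apply_substitutions(TOY_FRAGMENT, ["K7E"]))  # charge swap at position 7
```

Checking the original residue before substituting guards against numbering offsets, which matter here because some constructs carry insertions after the first methionine.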

1-43. (canceled)
44. A method of cleaving a DNA target sequence from a mouse ROSA26 locus comprising contacting said DNA target sequence with an I-CreI variant to thereby cleave said DNA target sequence wherein said I-CreI variant comprises a first monomer and a second monomer which are associated to form an active form, wherein said I-CreI variant comprises at least two substitutions in at least one of the monomers, wherein at least one substitution is of a residue in the range of positions 26 to 40 of I-CreI and at least one substitution is of a residue in the range of positions 44 to 77 of I-CreI and wherein said DNA target sequence is at least one sequence selected from the group consisting of SEQ ID NO: 5 to 14 and 16 to 30.
45. The method of claim 44, wherein said at least one substitution of a residue in the range of 26 to 40 of I-CreI is at least one substitution of a residue selected from the group consisting of positions 26, 28, 30, 32, 33, 38 and 40.
46. The method of claim 44, wherein said at least one substitution of a residue in the range of 44 to 77 of I-CreI is at least one substitution of a residue selected from the group consisting of positions 44, 68, 70, 75 and 77.
47. The method of claim 44, wherein said substitutions comprise replacing the wild-type amino acids with an amino acid selected from the group consisting of A, D, E, G, H, K, N, P, Q, R, S, T, Y, C, W, L and V.
48. The method of claim 44, wherein said I-CreI variant further comprises at least one substitution selected from the group consisting of: G19S, I24V, S79G, V105A, K107R, V151A, D153G and K158E.
49. The method of claim 44, wherein said I-CreI variant further comprises at least one substitution at positions 137 to 143 of I-CreI.
50. The method of claim 44, which comprises substitution of the aspartic acid in position 75 of I-CreI.
51. The method of claim 50, wherein position 75 of I-CreI is substituted with an asparagine residue.
52. The method of claim 44, wherein said variant is a homodimer.
53. The method of claim 44, wherein said variant is a heterodimer, resulting from the association of a first and a second monomer having different mutations in positions 26 to 40 and 44 to 77 of I-CreI.
54. The method of claim 53, wherein the first and the second monomer of said I-CreI variant, respectively, comprise at least one pair of substitutions selected from the group consisting of: N30H, Y33S, Q44E, R68C, R70S, D75N and N30D, Y33R, Q38T, Q44K, R68E, R70S, I77R, S32N, Y33G, Q44K, R70E, D75N and S32T, Q38W, Q44K, R68E, R70S, I77R, Y33R, Q38N, S40Q, Q44N, R70S, D75R, I77D and N30H, Y33S, Q44A, R70S, D75Q, I77E, K28S, Q38R, S40K, Q44D, R68Y, R70S, D75S, I77R and Y33C, Q38A, R68A, R70K, D75N, Y33C, Q44T, R70S, D75Y and S32D, Q38C, Q44D, R68Y, R70S, D75S, I77R, S32T, Y33C, R68T, R70N, D75N and S32T, Q38W, Q44K, R70E, D75N, R70S, D75R, I77Y and Y33R, Q38A, S40Q, Q44A, R70S, D75N, K28S, Q38R, S40K, Q44T, R68N, R70N, D75N and Y33H, Q38S, Q44K, R68Y, R70S, D75Q, I77N, K28A, Y33S, Q38R, S40K, Q44N, R68Y, R70S, D75R, I77V and S32T, Y33C, Q44A, R70S, D75N, S32D, Y33H, Q44K, R68E, R70S, I77R and S32D, Y33H, Q44D, R68N, R70S, D75N, N30R, S32D, R68S, R70K, D75N and D75N, K28R, Y33A, Q38Y, S40Q, R68Y, R70S, D75R, I77Q and Y33R, Q38A, S40Q, R70S, I77K, K28R, N30D, D75E, I77R and S32D, Q38C, Q44A, R70S, D75R, I77Y, Y33R, Q38A, S40Q, R70S, D75N and R70S, D75Y, I77R, Y33R, Q38A, S40Q, Q44N, R68Y, R70S, D75R, I77V and N30R, S32D, Q44T, R68H, R70H, D75N, Y33P, S40Q, Q44K, R68Y, R70S, D75Q, I77N and S32A, Y33C, R68Y, R70S, D75R, I77Q, K28A, Y33S, Q38R, S40K, R68N, R70S, D75N, I77R and S32N, Y33G, Q44A, R68A, R70K, D75N, N30H, Y33S, Q44Y, R70S, I77V and Y33R, S40Q, Q44A, R70S, D75N, S32D, Y33H, R68T, R70N, D75N and N30R, S32D, Q44T, R68N, R70N, D75N, S32R, Y33D, Q44A, R70S, D75N and Y33S, Q38R, S40H, R68H, R70H, D75N, Y33P, S40Q, Q44K, R68Y, R70S, D75Q, I77N and K28R, Y33A, Q38Y, S40Q, Q44T, R68H, R70H, D75N, K28Q, Q38R, S40K, Q44A, R70S, D75E, I77R and N30R, S32D, Q44K, R68Y, R70S, D75Q, I77N, Y33T, Q38A, R68H, R70H, D75N and N30H, Y33S, R70S, I77K, Y33T, S40T, R68A, R70K, D75N and K28R, Y33A, Q38Y, S40Q, R70S, and S32D, Y33H, Q44N, R70S, D75R, I77D and S32D, Q38C, R70S, I77K.
55. The method of claim 53, wherein the first monomer is selected from the group consisting of SEQ ID NO: 82 to 106 and the second monomer is selected from the group consisting of SEQ ID NO: 107 to 116, 4, and 117 to 130.
56. The method of claim 53, wherein the target sequence is SEQ ID NO: 15.
57. The method of claim 56, wherein the first monomer has substitutions selected from the group consisting of: I24V, Q44Y, R70S and D75N; I24V, Q44Y, R68Y, R70S, D75Y and I77R; I24V, Q44Y, R70S, D75N and I77V; I24V, Q44Y, R68N, R70S and D75R; I24V, Q44Y, R68S, R70S and D75R; I24V, Q44Y, R70S and D75Q; I24V, Q44Y, R68Y, R70S, D75R and I77V; I24V, Q44Y, R70S, D75Y and I77T, and the second monomer has substitutions selected from the group consisting of: K28E, Y33R, Q38R, S40R, Q44A, R68H, R70Q and D75N; K28E, Y33R, Q38R, S40R, Q44A, R70N and D75N; K28E, Y33R, Q38R, S40K, Q44A, R68H, R70Q and D75N; K28E, Y33R, Q38R, S40K, Q44V, R70A and D75N; K28E, Y33R, Q38R, S40K, Q44A, R70G and D75N.
58. The method of claim 57, wherein the first monomer is selected from the group consisting of SEQ ID NO: 39, 43, 45, 49, 50, 51, 53 and 54, and the second monomer is selected from the group consisting of SEQ ID NO: 60, 61, 63, 65 and 66.
59. The method of claim 58, wherein at least one of the two I-CreI monomers has at least 95% sequence identity with, for the first monomer, SEQ ID NO: 39, 43, 45, 49, 50, 51, 53 or 54, and, for the second monomer, SEQ ID NO: 60, 61, 63, 65 or 66.
60. The method of claim 56, wherein the first and the second monomer, respectively, comprise at least the following substitutions: I24V, Q44Y, R70S, D75Y, I77T and K28E, Y33R, Q38R, S40R, Q44A, R68S, R70Q, D75N.
61. The method of claim 60, wherein the first and the second monomers are SEQ ID NO: 54 and SEQ ID NO: 62, respectively.
62. The method of claim 61, wherein at least one of the two I-CreI monomers has at least 95% sequence identity with, for the first monomer, SEQ ID NO: 54, and, for the second monomer, SEQ ID NO: 62.
63. The method of claim 56, wherein the first monomer has substitutions selected from the group consisting of: I24V, Q44Y, R68N, R70S and D75R; I24V, Q44Y, R68S, R70S and D75R; I24V, Q44Y, R70S and D75Q; I24V, Q44Y, R68Y, R70S, D75R and I77V; I24V, Q44Y, R70S, D75Y and I77T, and the second monomer has substitutions selected from the group consisting of: K28E, Y33R, Q38R, S40K, Q44A, R70S, D75N and K28E, Y33R, Q38R, S40K, Q44A, R68T, R70N, D75N.
64. The method of claim 63, wherein the first monomer is selected from the group consisting of the sequences SEQ ID NO: 49, 50, 51, 53 and 54, and the second monomer is selected from the group consisting of the sequences SEQ ID NO: 64 and 67.
65. The method of claim 64, wherein at least one of the two I-CreI monomers has at least 95% sequence identity with, for the first monomer, SEQ ID NO: 49, 50, 51, 53 or 54, and, for the second monomer, SEQ ID NO: 64 or 67.
66. The method of claim 56, wherein the first monomer is selected from the group consisting of the sequences SEQ ID NO: 72 and 73 and the second monomer is selected from the group consisting of the sequences SEQ ID NO: 74 to 77.
67. The method of claim 66, wherein at least one of the two I-CreI monomers has at least 95% sequence identity with, for the first monomer, SEQ ID NO: 72 or 73, and, for the second monomer, SEQ ID NO: 74, 75, 76, or 77.
68. The method of claim 53, wherein the first monomer further comprises the D137R mutation and the second monomer further comprises the R51D mutation.
69. The method of claim 53, wherein the first monomer further comprises the E8R or E8K and E61R mutations and the second monomer further comprises the K7E and K96E mutations.
70. The method of claim 44, wherein said variant is a single-chain chimeric meganuclease comprising two I-CreI monomers.
71. The method of claim 70, wherein said chimeric meganuclease comprises a first monomer and a second monomer wherein each monomer has the same substitutions.
72. The method of claim 70, wherein said chimeric meganuclease comprises a first monomer and a second monomer wherein each monomer has at least one different substitution in positions 26 to 40 and 44 to 77 of I-CreI.
73. The method of claim 44, wherein said I-CreI variant is made from the starting scaffold of SEQ ID NO: 1.
74. The method of claim 44, wherein said I-CreI variant is made from the starting scaffold of SEQ ID NO: 133.
75. The method of claim 44, wherein said I-CreI variant is made from the starting scaffold of SEQ ID NO: 4.
76. The method of claim 44 wherein said contacting is in a cell.
77. The method of claim 44 wherein said I-CreI variant is expressed in a cell from a polynucleotide encoding said I-CreI variant.